Final Report on ONR N00014-86-K-0054
Roger King
Boulder, Colorado 80309

The self-adaptive databases project at the University of Colorado has produced
several substantial results. Parallel algorithms for the maintenance of derived data in an
object-oriented database management system have been developed. These algorithms
dramatically reduce the amount of I/O necessary to keep complex engineering database
entities up to date.
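
As a rough illustration of why the way derived data is maintained affects I/O, the following minimal sketch recomputes a derived value only when it is read, not on every update to its inputs. It is not the project's parallel algorithm; it is a simple sequential lazy-invalidation scheme, and the class name DerivedStore, its methods, and the recomputations counter (a stand-in for I/O cost) are all hypothetical.

```python
from collections import defaultdict


class DerivedStore:
    """Toy store whose derived values are recomputed lazily, only when read."""

    def __init__(self):
        self.base = {}                      # base attribute values
        self.rules = {}                     # derived name -> (function, input names)
        self.dependents = defaultdict(set)  # input name -> derived names that use it
        self.cache = {}                     # derived name -> last computed value
        self.dirty = set()                  # derived names whose cached value is stale
        self.recomputations = 0             # hypothetical stand-in for I/O cost

    def define(self, name, func, inputs):
        """Register a derived attribute computed from base or derived inputs."""
        self.rules[name] = (func, tuple(inputs))
        for inp in inputs:
            self.dependents[inp].add(name)
        self.dirty.add(name)

    def update(self, name, value):
        """Write a base value and mark everything downstream as stale."""
        self.base[name] = value
        stack = list(self.dependents[name])
        while stack:
            derived = stack.pop()
            if derived not in self.dirty:
                self.dirty.add(derived)
                stack.extend(self.dependents[derived])

    def read(self, name):
        """Return a value, recomputing a derived attribute only if it is stale."""
        if name in self.base:
            return self.base[name]
        if name in self.dirty or name not in self.cache:
            func, inputs = self.rules[name]
            self.cache[name] = func(*(self.read(i) for i in inputs))
            self.dirty.discard(name)
            self.recomputations += 1
        return self.cache[name]
```

In this sketch, any number of updates to a base value between two reads costs at most one recomputation per affected derived attribute.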

Mechanisms have been developed that integrate two directions prominent in the
database research community: behavioral and structural (or "semantic")
object-oriented modeling. This integration supports data objects that are both
structurally complex and behaviorally powerful, which is crucial in supporting emerging
engineering applications.

Also, the project has resulted in the development of mechanisms for the
self-adaptive clustering of data and the self-adaptive scheduling of database updates
according to usage patterns. A self-adaptive approach is seen as a promising way to solve a
well-known shortcoming of relational databases: they are typically too slow to provide
proper support of engineering systems.
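
One way to picture clustering driven by usage patterns is the sketch below. It assumes a very simple model in which the system counts how often pairs of objects are touched by the same transaction and then greedily places frequently co-accessed objects on the same page. The function names, the pair-count statistic, and the greedy policy are illustrative assumptions and are not taken from the project's mechanisms.

```python
from collections import Counter
from itertools import combinations


def record_accesses(cooccur, transaction):
    """Count each unordered pair of distinct objects touched by one transaction."""
    for a, b in combinations(sorted(set(transaction)), 2):
        cooccur[(a, b)] += 1


def recluster(objects, cooccur, page_size):
    """Greedily place frequently co-accessed objects on the same page."""
    page_of = {}  # object -> index of the page that holds it
    pages = []    # list of pages, each a list of objects
    # Visit pairs from strongest to weakest affinity (ties broken by name).
    for (a, b), _count in sorted(cooccur.items(), key=lambda kv: (-kv[1], kv[0])):
        # If exactly one of the pair is placed, try to put the other beside it.
        for x, y in ((a, b), (b, a)):
            if x in page_of and y not in page_of:
                page = pages[page_of[x]]
                if len(page) < page_size:
                    page.append(y)
                    page_of[y] = page_of[x]
        # If neither is placed, open a new page for the pair.
        if a not in page_of and b not in page_of and page_size >= 2:
            pages.append([a, b])
            page_of[a] = page_of[b] = len(pages) - 1
    # Objects still unplaced go on the last page if it has room, else a new page.
    for obj in objects:
        if obj not in page_of:
            if not pages or len(pages[-1]) >= page_size:
                pages.append([])
            pages[-1].append(obj)
            page_of[obj] = len(pages) - 1
    return pages
```

Calling record_accesses as transactions run and recluster periodically would make the physical layout follow the observed workload. A real system would also have to weigh the cost of moving data against the expected savings.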

Finally, a prototype system has been implemented to provide a basis for
experimentation and for the evolution of the underlying algorithms. In particular,
substantial experiments have been performed to illustrate that the techniques
developed are useful for engineering databases.

The algorithms and self-adaptive mechanisms, as well as the prototype
implementation, are described in detail in [2, 5], and the application of this system to
engineering databases is discussed in [1, 3, 4]. Semantic models are described in [6].

All of the papers referenced in this report were derived from work supported by this
project. The three journal papers referenced are included with this report to
provide more detail on the results of the project. Work is continuing, with the
parallel algorithms being adapted and expanded to handle the distribution of complex,
derived data over a network of databases. Because engineering design applications are
naturally distributed, this is seen as an important research direction.

ships, not attributes. It views the world as consisting of entities and relationships
among entities. Both entities and relationships may have single-valued printable
attributes. In the original ER Model ISA

neous databases in the Multibase project [Landers and Rosenberg 1982; Smith et
al. 1981]. Integration of FDM schemas is studied in Dayal and Hwang [1984]. FDM
also served as the basis for one of the


[Figure: World Traveler schema diagram; legible node labels include OCCUPATION, BUSINESS, and TRAVELER.]


[Figure: representation of a portion of the World Traveler schema.]


[Figure 10: table of languages based on semantic models. Blanks indicate capability not supported to the authors' knowledge. References in the figure are keyed to the reference list. Legible column headings: MODEL, IMPLEMENTATION STRATEGY, MACHINE/OS/IMPLEMENTATION LANG, ASSOCIATED SYSTEM.]


[Figure: table of graphics-based interfaces based on semantic models. Blanks indicate capability not supported to the authors' knowledge. Legible headings include SYSTEM, IMPLEMENTATION, FUNCTIONALITY, USE OF GRAPHICS, and INTERFACE TO DBMS.]


«««;» 


nn 


in»nr> 


IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 14, NO. 6, JUNE 1988


The Cactis Project: Database Support
for Software Environments

SCOTT E. HUDSON, MEMBER, IEEE, AND ROGER KING, MEMBER, IEEE


Abstract: The Cactis project is an on-going effort oriented toward
extending database support from traditional business-oriented
applications to software environments. The main goals of the project are
to construct an appropriate data model, and develop new techniques to
support the unusual data management needs of software environments,
including program compilations, software configurations, load modules,
project schedules, software versions, nested and long transactions, and
program transformations. The ability to manage derived information is a
common theme running through many of these unusual data needs, and the
Cactis database management system is unique in its ability to represent
and maintain derived data in a time and space efficient fashion. A
central contribution of Cactis is its integration of the type
constructors of semantic models and the localized behavior capabilities
of object-oriented database management systems. The Cactis database
management system is nearing completion.

Index Terms: Database management system, software environments.


I.  Introduction 

THE goal of the Cactis project is to extend the usefulness of database
technology from traditional database applications to engineering, and in
particular, software and software environments [5], [9], [39], [41],
[45]. This means developing database facilities for the management of
the myriad of unusual processes and data types involved in the support
of the software lifecycle, both for small scale programming tasks, and
for programming in the large [11]. These include the design, coding, and
debugging of computer programs, as well as the creation, maintenance,
and reuse of modules and versions. Software environments are also
designed to manage documentation, requirements specifications,
schedules, bug reports, test data, etc.

Researchers have noted the unique database requirements of software
environments [2], [21]. In this paper, we describe the general scope of
the Cactis project, which involves the development of a DBMS called
Cactis, as well as the construction of Cactis application software
designed to support software environments. We discuss the manner in
which the Cactis DBMS provides a unified approach to satisfying many of
the needs of software engineering. Cactis, as it stands, is a multiuser
centralized database management system. A brief preliminary report on
Cactis appears in [20].

In Section II of this paper, we discuss the Cactis DBMS. Our focus is
not on the construction of database systems, but on the special data
modeling capabilities of Cactis and how they can be used to support the
software lifecycle. Therefore, in Section II we concentrate on the
Cactis data model and the methods Cactis uses to manage complex data.
The data model of Cactis includes powerful type constructors necessary
to model such objects as programs and program versions. Unlike business
applications, there are also many forms of derived data in a software
environment. These can include coarse grained data such as program
compilations, software configurations, and load modules, as well as fine
grained data such as the control and data flow properties of individual
modules, statements, or expressions. The type constructors found in
Cactis are derived from semantic databases [24], [29]. To manage derived
information, Cactis uses techniques in the spirit of object-oriented
databases [13], whereby the rules necessary to calculate derived values
are embedded in database objects. Thus, Cactis is both a semantic and an
object-oriented database.

In the third section, we describe on-going efforts to apply Cactis to
software environment support. This work is in progress and is
incomplete. Our goal is to construct a real-life software application on
top of Cactis, and to use it as a way of evolving the functionality and
implementation of Cactis. In Section III we focus on how Cactis may be
used to support the unusual forms of data manipulation required by a
software environment DBMS. The needs of a software environment database
range over a broad spectrum of capabilities. For example, a software
environment must support fine grained data about individual modules and
statements for use in optimization of code within a compiler; a database
can simplify this task by allowing a program to be viewed as a number of
data objects, and by directly supporting the primitives needed to
implement data flow analysis.

To support aspects of programming in the large within the same
framework, the database should also support the manipulation of programs
as a whole and even entities larger than single programs. For example, a
common software tool is the “make” capability found in UNIX* [15] and
other environments [8], which is used to control recompilation of
programs based on last modification times and mutual dependencies.
Cactis can easily be programmed to perform these tasks. Similarly,
Cactis can be used to manage entities which represent schedules and
other management data which may transcend a single program. Another
example of special requirements placed on a software environment DBMS is
that the user is likely to desire database transactions whose durations
are much longer than traditional business transactions. A typical
transaction might be a program bug fix, which could involve a long,
interactive period of updating a program, then the processes of
recompiling the program and reconfiguring any system which uses it. This
may also call for the use of nested transactions, to support the many
subactivities involved in this procedure.

*UNIX is a registered trademark of AT&T Bell Laboratories.
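The “make”-style behavior mentioned above can be viewed as a derived property of objects connected by dependency relationships. The following is a minimal sketch of that idea, not the Cactis mechanism itself; the names `Target`, `needs_rebuild`, and `rebuild_order` are hypothetical, and the dependency graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    """A buildable entity (hypothetical): source file, object file, program."""
    name: str
    mtime: float                                  # last modification time
    deps: List["Target"] = field(default_factory=list)


def needs_rebuild(target: Target) -> bool:
    """Stale if some dependency is newer than the target or is itself stale."""
    return any(d.mtime > target.mtime or needs_rebuild(d) for d in target.deps)


def rebuild_order(target: Target, out: Optional[List[str]] = None) -> List[str]:
    """List stale targets, dependencies first (assumes an acyclic graph)."""
    out = [] if out is None else out
    for d in target.deps:
        rebuild_order(d, out)
    if needs_rebuild(target) and target.name not in out:
        out.append(target.name)
    return out


# Usage: main.c was edited after main.o was last built.
src = Target("main.c", mtime=20.0)
obj = Target("main.o", mtime=10.0, deps=[src])
prog = Target("prog", mtime=15.0, deps=[obj])
stale = rebuild_order(prog)
```

Here staleness is computed on demand from modification times, which mirrors the idea that "out of date" is information derived from other data rather than stored separately.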

To date, the main contributions of the Cactis project are the
integration of semantic and object-oriented database mechanisms, and the
ability to manage derived data in a space and time efficient fashion.
The mechanisms used within a Cactis database to manage derived data are
based loosely on attribute grammar techniques used in compiler
construction [32], [33], as well as on more recent work on incremental
attribute evaluation [12], [43] used in syntax-directed editors. These
mechanisms may be used to implement, in a uniform fashion, the various
forms of derived data found in a software environment. Another
contribution of Cactis is an efficient rollback and recovery mechanism,
which is of primary importance in order to support long nested
transactions and versions.

The fourth section of this paper describes the current status of Cactis.
It also describes future goals, and in particular discusses the manner
in which we intend to experiment with our prototype system. As little is
known about the sorts of real-life data that will be found in
environments of the future, we propose instrumenting Cactis with
software which may be used to gather statistical information. We will
then distribute Cactis for experimentation. In this way, we may learn
critical information concerning such things as data object sizes,
transaction types and durations, and rollback and recovery needs. This
will allow us to evolve Cactis into a more useful system.

In sum, the Cactis project clearly has very broad research goals. It is
intended to support the database needs of the entire software lifecycle,
and thus incorporates much of the functionality of a software
environment. In this paper, we address the unique data needs of
environments, and show how the Cactis DBMS is able to support them with
a small set of data modeling and data manipulation mechanisms. The
general goal of the project is to centralize all of the database
functionality of a software environment. This will minimize the effort
needed to construct environments, and will help allow the design, use,
reuse, and maintenance of software to be performed efficiently.

II.  The  Cactis  DBMS 

The term “object-oriented” is widely used, with many interpretations. It
is also a very popular term at this time. While we do not claim that
Cactis meets everyone's definition of object-oriented, we do feel that
it reflects the two major interpretations of the term. It supports
static (or structural) objects, by providing type constructors along the
lines of a semantic model. Cactis also allows objects to have local
behavior, in the form of procedurally derived attributes. Below, we
describe the Cactis data model and the manner in which it has been
implemented in order to ensure a reasonable level of efficiency. As the
goal of this paper is to describe the application of Cactis, and not the
details of its implementation, we describe the system only enough to
provide the proper background for Section III.

Other researchers have built object-oriented database systems. A number
of object-oriented database research projects are described in [13].
These projects range from database implementations of the message
passing paradigm of Smalltalk [35], [36] to extensible systems which are
designed to support objects of varying size [6], to systems designed to
support general engineering and multimedia applications [37], [50]. We
feel that Cactis is unique in its clean intermixing of structural and
behavioral mechanisms, and in its efficient implementation.

A.  The  Cactis  Data  Model 

Traditionally, database applications have used database systems which
are based on simple record-oriented models. Models such as the network
or relational models essentially represent an object as a flat record
structure, consisting of a finite number of fields. Each field contains
a fixed-length printable value. Thus, when manipulating a traditional
database, an application program typically processes a large number of
identically structured records and performs a similar manipulation on
each record. An example might be deducting federal tax from each of
perhaps thousands of employee payroll records. It might be necessary to
relate other information to each record, and if so, logical or physical
pointers are used. For example, occasionally an employee may have a
partial tax exempt status; in these cases, the payroll record might
point to a record in another file which tells the payroll system how to
calculate that person's tax.

There are two very important distinctions that separate traditional
business applications from engineering database applications. First, the
objects being manipulated are typically not as easily represented as
simple, flat records. Second, an engineering database does not usually
have the very low schema to data ratio that is commonly found in
business applications. This means that there are fewer objects of a
given type. These two distinctions make the record-oriented batch mode
of business databases less useful in an engineering application.

Specifically, VLSI and PCB designs, CAD/CAM objects, and software
objects are often very intricate. These applications have unusual data
modeling needs, such as the ability to allow an object to have
structured or complex attributes, and the ability to represent unusual
forms of data, such as derived data [20] and versions [28]. Further,
when manipulating engineering databases, a user often deals with a
smaller number of much more intricate objects. Commonly, a business user
would manipulate at one time many employees in a payroll database or
savings accounts in a bank, whereas an engineer would manipulate only a
few modules in a software system at one time. The engineer might also
need to keep track of multiple versions of one program. In order to
support the data modeling needs of a software environment, Cactis has
drawn on two areas of recent database research, semantic and
object-oriented data modeling.

1) Constructed Types: A Cactis database is viewed as a collection of
objects similar to the objects found in Smalltalk [17]. Each data object
has a group of data values attached to it called attributes. An
attribute may have a value of any C data type (except pointer types).
The set of attributes attached to an object is determined by the type of
the object. The type of an object may be statically determined, or may
vary dynamically on the basis of a predicate that defines subtype
membership. As the attribute values of an object change, the object may
meet or fail to meet the criteria for inclusion in a particular subtype
and hence be moved up and/or down in the type hierarchy.

In addition to an internal structure defined by attribute values,
objects may be connected recursively by typed and directed relationships
to form higher level or abstract objects. Thus, Cactis is a semantic
model, similar to the entity-relationship model [7] and the semantic
data model [18]. It is actually based on the INSYDE model [30]. In
semantic modeling terminology [24], relationships are used to construct
aggregations, or complex objects. Unlike conventional record-oriented
database objects, aggregations may have properties (relationships) which
are not printable.

An example type would be a Load_Module, as shown by example data objects
in Fig. 1. Simplistically, a Load_Module might be modeled as consisting
of objects with one multivalued relationship, called Modules, which
relates objects of type Load_Module to objects of type Module. Modules
themselves would be aggregations related to other Modules. Load_Module
might also have three attributes, one giving its name, another the date
it was last formed, and a third giving the name of a file containing the
actual object code. An example subtype might be Recent_Load_Modules,
which would include all load modules less than, say, a week old.
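As an illustration of the Load_Module example, the sketch below models the three attributes, the multivalued Modules relationship, and the predicate-defined subtype Recent_Load_Modules. It is written in Python rather than in the Cactis data language, and the field and function names (`formed`, `object_file`, `is_recent`, `recent_load_modules`) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class Module:
    name: str
    uses: List["Module"] = field(default_factory=list)   # Modules relate to Modules


@dataclass
class LoadModule:
    name: str                 # attribute: name of the load module
    formed: datetime          # attribute: date it was last formed
    object_file: str          # attribute: file holding the object code
    modules: List[Module] = field(default_factory=list)  # "Modules" relationship


def is_recent(lm: LoadModule, now: datetime) -> bool:
    """Predicate defining membership in the Recent_Load_Modules subtype."""
    return now - lm.formed < timedelta(weeks=1)


def recent_load_modules(all_lms: List[LoadModule], now: datetime) -> List[LoadModule]:
    """Membership follows from attribute values, so it can change over time."""
    return [lm for lm in all_lms if is_recent(lm, now)]


# Usage: the same object leaves the subtype as time passes.
parser = Module("parser")
a = LoadModule("compiler", datetime(1988, 6, 12), "compiler.out", [parser])
b = LoadModule("editor", datetime(1988, 5, 1), "editor.out")
members_now = recent_load_modules([a, b], datetime(1988, 6, 15))
members_later = recent_load_modules([a, b], datetime(1988, 6, 30))
```

No object is explicitly assigned to the subtype; membership is a consequence of the `formed` attribute, which is the sense in which a type may "vary dynamically on the basis of a predicate."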

Fig. 1. Example objects of type Load_Module and Module.

In addition to the subtyping and aggregation capabilities, it is
important to point out a fundamental difference between the Cactis data
model and conventional models. This difference concerns the nature of
relationships. In a conventional model the type of a relationship
uniquely defines the types of the objects which are related. In the
Cactis data model this is not true. A relationship only defines the
type, direction, and number of values that flow between objects (see the
next section on local behavior). Consequently, the exact type of the
object at the other end of a relationship is not known until the
relationship is traversed. This allows the types of objects to change or
be extended dynamically without affecting related objects. The details
of how an object is implemented, such as its exact type, its storage
structure, or how it derives data, are all encapsulated within the
object and hidden from the objects related to it. For example, in Fig. 1
we could transparently substitute an object of type New_Module for an
object of type Module so long as the same external interface is
maintained (i.e., the same connecting relationship type is used). This
means that an object can be transparently related to an object of a type
that did not even exist when the object was created. This feature is
crucial for dynamically evolving systems like software environments.
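The substitution of New_Module for Module can be sketched as follows. This is an illustrative Python analogy, not Cactis syntax: the exported value `size` and the method name `export_size` are hypothetical stand-ins for "the values that flow across the relationship."

```python
class Module:
    """Exports one value (a hypothetical 'size') across the relationship."""

    def __init__(self, size: int) -> None:
        self._size = size

    def export_size(self) -> int:
        return self._size


class NewModule:
    """A type introduced later; it derives the same exported value differently."""

    def __init__(self, code_size: int, data_size: int) -> None:
        self._code = code_size
        self._data = data_size

    def export_size(self) -> int:
        return self._code + self._data


class LoadModule:
    def __init__(self, modules) -> None:
        self.modules = list(modules)   # the far-end types are not fixed here

    def total_size(self) -> int:
        # Only the values flowing across the relationship are used, so any
        # object offering the same interface can sit at the other end.
        return sum(m.export_size() for m in self.modules)


# Usage: a NewModule is related to a LoadModule written before it existed.
lm = LoadModule([Module(10), NewModule(3, 4)])
total = lm.total_size()
```

`LoadModule` never inspects the type of the related objects, only the value they export, which is the property the text describes as the relationship defining the interface.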

2) Local Behavior: The Cactis data model is also an object-oriented
model, in the behavioral sense. This means that individual objects in
the database have embedded within them the means to respond to changes
elsewhere in the database.

In a Cactis database, each object is an encapsulation of data along with
a mechanism for implementing local behavior involving the data. The
specific mechanism used in Cactis to support local behavior is that of
derived attributes. This mechanism allows attributes to be derived by a
function of other local attributes of the object as well as attribute
values imported into the object over relationships. As a consequence,
the interface to an object consists of its relationships and the values
imported and exported across those relationships. An object can be seen
as expecting certain types of values from its environment, and providing
other values in return. The data language of Cactis [21] allows
derivation functions to be constructed using all the arithmetic and
logical operators found in the C language, plus several conditional and
iteration constructs. Because of this flexibility, very complex behavior
can be encapsulated in a Cactis object. The only restriction on these
functions is that they may not have side effects (they must be
applicative). Examples of such behavior include prioritizing bug report
objects on the basis of deadlines derived from scheduling objects, as
discussed in Section III-A, or update of dataflow analysis information,
as described in Section III-B.
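A derived attribute of the kind described above can be sketched as a side-effect-free function of local attributes and values imported over a relationship. The example below is a Python illustration only; the types `Schedule` and `BugReport`, the `severity` scale, and the priority formula are all assumptions chosen for the example, not taken from the Cactis data language.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Schedule:
    deadline: date            # stored attribute, exported to related objects


@dataclass(frozen=True)
class BugReport:
    severity: int             # stored attribute, 1 (minor) to 5 (critical)
    schedule: Schedule        # relationship over which 'deadline' is imported

    def days_left(self, today: date) -> int:
        """Derived from a value imported across the relationship."""
        return (self.schedule.deadline - today).days

    def priority(self, today: date) -> float:
        """Derived from a local attribute and another derived attribute.

        Applicative: it only reads values and returns a result.
        """
        return self.severity / max(self.days_left(today), 1)


# Usage: order bug reports by derived priority.
today = date(1988, 6, 1)
soon = BugReport(severity=2, schedule=Schedule(date(1988, 6, 3)))
later = BugReport(severity=5, schedule=Schedule(date(1988, 6, 11)))
ranked = sorted([later, soon], key=lambda b: b.priority(today), reverse=True)
```

Because the objects are frozen and the functions have no side effects, a change to a `Schedule` deadline is reflected simply by recomputing the dependent values, which is what makes incremental reevaluation possible.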

B. The Implementation of Cactis

In  this  section  we  focus  on  algorithms  used  by  Cactis 
to  manage  derived  data,  as  these  are  the  features  of  the 
system  which  most  heavily  influence  its  ability  to  manage 
software  databases.  For  further  information  about  the 
Cactis  implementation,  see  [23]. 

1) Derived Subtypes: In a Cactis database, the subtyping mechanism has
the potential of creating a large amount of derived information. In
order to prevent a storage problem from occurring, it is possible to
allow Cactis to decide when a subtype should be materialized. The system
supports an option which allows it to be self-adaptive, in that it
continuously monitors itself to determine which subtypes should be
reevaluated. Cactis will keep track of the usage history of each
subtype, and then use this information to decide when to materialize a
subtype. If this option is chosen, Cactis will only materialize a
subtype under one of two conditions: if the subtype is queried, or if
usage statistics indicate that it is cost-effective to materialize it.
This allows the system to use less space and still provide quick access
to certain subtypes.

Under this option, Cactis keeps an exponentially decaying average (by
powers of 2) of how many times each subtype has been referenced during
the previous database sessions. This is multiplied by the number of
objects which must be reevaluated for subtype inclusion (which can be
derived from the object-level dependency information), in order to give
a weighted factor. These weighted factors give an indication of which
subtypes are highly volatile and are commonly referenced.

Subtypes with large weighted factors are materialized frequently, so
that the system is able to anticipate query needs. This minimizes the
delay involved in waiting for a response to examine a subtype, but
does not materialize infrequently needed subtypes that are not
volatile. The database designer may provide a parameter to decide how
large a subtype's weighted cost factor must be to materialize it. This
parameter is chosen with respect to the available storage. We note
that this technique assumes a certain amount of locality of reference
with respect to subtype examination.
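
The weighting scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not give the exact decay formula, so the sketch assumes the old average is halved at the end of each session, and the names SubtypeStats, end_session, weighted_factor, and should_materialize are hypothetical rather than part of Cactis.

```python
from dataclasses import dataclass


@dataclass
class SubtypeStats:
    """Usage statistics for one subtype (hypothetical structure)."""
    decayed_refs: float = 0.0        # exponentially decaying reference average
    objects_to_reevaluate: int = 0   # objects needing a subtype inclusion check

    def end_session(self, refs_this_session: int) -> None:
        # Assumed decay rule: each older session counts half as much as the
        # one after it, so weights fall off by powers of 2.
        self.decayed_refs = (self.decayed_refs + refs_this_session) / 2.0

    def weighted_factor(self) -> float:
        # Reference frequency multiplied by the reevaluation workload.
        return self.decayed_refs * self.objects_to_reevaluate


def should_materialize(stats: SubtypeStats, threshold: float) -> bool:
    # The threshold stands in for the designer-supplied parameter.
    return stats.weighted_factor() >= threshold
```

A subtype referenced 8 times in one session and not at all in the next would have a decayed average of 2.0 under this rule; with 10 objects awaiting reevaluation its weighted factor would be 20.0.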

2) Derived Attributes: The mechanism used to implement derived
attributes is efficient in time and space. It is based loosely on
recent work on incremental attribute evaluation (1-1. [43] and is
philosophically similar to the mechanism used to maintain subtypes. In
particular, the data managed by Cactis can be seen as an attributed
graph. Such attributed graphs are a generalization of the attributed
trees used for syntax-directed and language-specific editors [44].
Unfortunately, the optimal incremental update algorithm developed for
this process [42] cannot be extended to attributed graphs. Instead, a
new incremental update algorithm has been developed for use with
Cactis. Other algorithms for incremental update of attributed graphs
are discussed in [1], [27]. However, these algorithms are either not
suitable for use in a mass-storage environment, or do not handle
arbitrary attributed graphs. Another system, which relates derived
attributes in a conventional attributed tree to relations in a
relational database, is discussed in [19]. Finally, a system for
standard incremental attribute evaluation using a variant of the
standard optimal algorithm in a distributed environment is discussed
in [26].

The Cactis update algorithm is optimal in the amortized set of
attributes recomputed after a change, but somewhat less than optimal
in total overhead. To be specific, when we amortize over any complete
transaction sequence, the set of attribute reevaluations charged to a
particular transaction is the set of attributes for which both of the
following two conditions hold. 1) Either the computed value would
change if evaluated, or the computed value of at least one attribute
directly related to that attribute would change value (this is the
minimum set of attributes to evaluate if all attributes must be
updated after each transaction). 2) The attribute's value will
eventually be used. That is, the attribute will be referenced at least
once before either the end of the transaction sequence or the point
where its value is overwritten by a different value. This is an
amortized bound. Consequently, some transactions may perform more
work, while other transactions perform correspondingly less. A
complete analysis including complexity of overhead computations can be
found in [22].

Because the algorithm is lazy, it can in fact outperform the optimal
algorithm for trees in many cases. Unlike the standard optimal
algorithm, which assumes that all attributes must be up to date after
each transaction, this algorithm only updates attributes which must be
evaluated in order to ensure all declared constraints are met and all
user-requested values are available. The computation of attribute
values which are not directly or indirectly observable from outside
the system is deferred. This can result in very significant savings.
Even without these savings, the algorithm vastly outperforms
conventional trigger-based systems [4], since its performance is never
worse than linear, whereas triggers can exhibit exponential behavior
in many cases. This linear behavior is crucial in applications such as
software environments where long chains of attribute dependencies are
required.

The Cactis incremental update algorithm works in two phases. The first
phase determines what derived data is potentially affected by a
change, and the second phase reevaluates the subset of that data which
must actually be recomputed. During the first phase, data which might
be directly or indirectly affected by a change is marked out-of-date
using a traversal over a graph which represents the dependencies
between attributes at the data level. During this traversal, certain
attributes will be encountered which are designated important. These
are the attributes which the system must ensure always have correct
values. Whenever an important attribute is marked out-of-date in the
first phase of the algorithm, it is remembered for the second phase.
The second phase of the algorithm recursively recomputes the proper
value of each important but out-of-date attribute based on the
attribute evaluation rule attached to the attribute. Since the second
phase operates in a demand-driven way, it can be lazy. Consequently,
it avoids evaluating attributes which are not actually needed. As a
result, only the optimal set of attributes is actually recomputed
after a change (although nonoptimal overhead is incurred to find these
attributes).
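
The two phases can be illustrated with a small in-memory sketch in Python. It is not the Cactis implementation: the class AttributeGraph and its methods are hypothetical, the graph is assumed to be acyclic, derived attributes are assumed to be declared after their inputs, and no attempt is made to model disk storage or the overhead analysis cited above.

```python
class AttributeGraph:
    """Toy attributed graph with mark-then-evaluate updates (hypothetical API)."""

    def __init__(self):
        self.rules = {}          # derived attr -> (function, input names)
        self.values = {}         # attr -> stored or cached value
        self.dependents = {}     # attr -> attrs whose rules read it
        self.out_of_date = set()
        self.important = set()
        self.eval_count = 0      # rule evaluations, to make laziness visible

    def define(self, name, func, inputs):
        """Declare a derived attribute; its inputs must already exist."""
        self.rules[name] = (func, list(inputs))
        self.dependents[name] = set()
        for source in inputs:
            self.dependents[source].add(name)
        self.out_of_date.add(name)

    def mark_important(self, name):
        """Important attributes are brought up to date and kept that way."""
        self.important.add(name)
        self.get(name)

    def set_intrinsic(self, name, value):
        """Store a value, then run phase 1 (mark) and phase 2 (evaluate)."""
        self.values[name] = value
        self.dependents.setdefault(name, set())
        for attr in self._mark_dependents(name):
            self.get(attr)

    def _mark_dependents(self, changed):
        """Phase 1: mark attributes reachable from the change as out-of-date."""
        pending = []
        stack = list(self.dependents[changed])
        while stack:
            attr = stack.pop()
            if attr in self.out_of_date:
                continue  # everything depending on it is already marked
            self.out_of_date.add(attr)
            if attr in self.important:
                pending.append(attr)  # remembered for phase 2
            stack.extend(self.dependents[attr])
        return pending

    def get(self, name):
        """Phase 2: demand-driven evaluation of out-of-date attributes."""
        if name in self.out_of_date:
            func, inputs = self.rules[name]
            args = [self.get(source) for source in inputs]
            self.values[name] = func(*args)
            self.out_of_date.discard(name)
            self.eval_count += 1
        return self.values[name]
```

In this sketch an attribute that nothing important depends on stays marked out-of-date after a change and is only recomputed if some later request reads it, which is the lazy behavior the text describes.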

In addition to efficient algorithmic techniques, the Cactis system
also uses self-adaptive heuristic techniques to improve disk access
performance over time. Statistics about past behavior are collected to
be used as a predictor of future behavior. These statistics are used
to cluster data which is frequently accessed. This results in
significant performance gains. In addition, these statistics are used
to schedule work to be done in an efficient manner. As described
above, the incremental attribute evaluation algorithm is a pair of
graph traversals. Because of the functional nature of the derivation
rules used in Cactis, these traversals can proceed in a number of
different orders. They could be done depth-first, breadth-first, or,
as in Cactis, in an order which past behavior indicates will be
efficient. In general the system schedules first the work which is
expected to incur the fewest new disk accesses. This preserves or
frees buffer space for later work, thereby attempting to reduce disk
accesses due to thrashing.
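
One way to picture this scheduling policy is as a priority queue keyed on expected disk cost. The Python sketch below is a simplification under stated assumptions: the function schedule, its parameters, and the cost estimate (zero for blocks already buffered, otherwise a per-block figure from collected statistics) are all hypothetical, and a real scheduler would also have to revise its estimates as blocks are loaded.

```python
import heapq


def schedule(pending, resident_blocks, expected_misses):
    """Yield pending work items, cheapest expected disk cost first.

    pending:          iterable of (item, block_id) pairs
    resident_blocks:  set of block ids currently in the buffer pool
    expected_misses:  dict block_id -> estimated new disk accesses
    """
    heap = []
    for order, (item, block) in enumerate(pending):
        if block in resident_blocks:
            cost = 0.0                              # no new disk access expected
        else:
            cost = expected_misses.get(block, 1.0)  # assumed default estimate
        # 'order' breaks ties so equal-cost items keep their arrival order.
        heapq.heappush(heap, (cost, order, item))
    while heap:
        _, _, item = heapq.heappop(heap)
        yield item
```

With work items x, y, and z on blocks 3, 1, and 2, block 1 resident, and estimates of 0.2 for block 2 and 0.9 for block 3, the items would be processed in the order y, z, x.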

Extensive performance tests have been performed on Cactis [23]. These
tests indicate the self-adaptive clustering and scheduling can save as
much as 60 percent of disk accesses for databases that contain
extensive amounts of derived data. Even for databases with little
derived information, a 3 or 10 percent savings in disk accesses is
usually realized.

III.  Supporting  Software  Environments 

In this section we describe ongoing experiments involving Cactis.
Various aspects of a software environment are being coded as Cactis
applications. We are attempting to construct these tools in a uniform,
integrated fashion. We feel that the data model supported by Cactis is
a natural mechanism for specifying the sorts of data needed by a
software environment. Our hope is that our experiments will serve
three purposes. First, we hope to validate our claim. Second, we plan
to illustrate that Cactis can support software environment data
efficiently. And finally, we will evolve the implementation of our
DBMS according to the experimental results we obtain. Clearly, we will
discover significant ways in which Cactis can be improved.

Users of a software environment must perform many data manipulations
that, unlike those of conventional database applications, involve
complex derived data. Cactis provides a unified framework for
manipulating derived information in a software environment at multiple
levels of granularity. In this section, we illustrate this capability
by discussing first derived data at the level of modules and whole
programs, then at the level of statements or expressions within a
program, and then finally, across multiple programs.

There have been numerous research projects aimed at supporting derived
information in software environments. Notable examples are the work in
version control [10], [40], [47], [49] and bug tracking [31]. Compared
to these efforts, our research differs significantly in its goals. We
are not primarily concerned with defining the functional requirements
and developing the capabilities of these facilities. Rather, we are
concerned with the ability of current database technology to provide
useful support for such systems. In particular, we focus on the
application of object-oriented data models to the support of software
development, and on the efficient maintenance of disk-based data. In
sum, our work is an attempt to provide effective, underlying database
support for software engineering tools.

A.  Programming  in  the  Large 

Objects used by the Cactis system are declared using the Cactis data
language. In this language one may declare both object and
relationship classes. A relationship class declaration provides names
and types for the values that flow between related objects and
indicates the direction of this flow. An object class provides names
and types for a set of attribute values and provides the name, class,
and direction of each potential relationship.

As a simple example, we might wish to create objects for use in a
simple bug/fix tracking system. Fig. 2 shows the declarations we might
use for such objects. Here several relationship classes have been
defined. These include bug_fix, which will relate a bug_report to a
fix_report, module_bug, which will relate a bug_report to a module,
and several others. Notice that, in order to give direction to
relationships, one end of the relationship is called a plug, and the
other a socket. Any object which possesses a plug of a given class may
be related to any object which possesses a socket of that class. In
some cases, relationships have been declared as Multi Plug or Multi
Socket. This indicates a one-to-many relationship where multiple plugs
(sockets) may attach to a single socket (plug). Relationships also
allow values to flow between objects. In the case of a bug_fix
relationship, one Boolean value (is_fixed) is transmitted from the
plug end of the relationship to the socket end. In cases where a
socket of class bug_fix is left unconnected, a value of false is
transmitted by default.

A bug_report object can be related to a number of other objects, as
indicated by the relationships section of the declaration. A
bug_report is related to the module in which the bug occurs by the
in_module relationship. It may also be related to a fix_report via the
fixed_by relationship, to a test result indicating the symptom via the
symptom relationship, and finally, to objects representing project
personnel via the reported_by and assigned_to relationships. Finally,
a bug_report object has two attributes: report_words, which contains a
textual description of the bug, and is_fixed, which indicates whether
the bug has been corrected.

    Relationship Class bug_fix
      Transmits
        is_fixed : Boolean To Socket, Default = False;
    End Relationship;

    Relationship Class module_bug Multi Plug End Relationship;
    Relationship Class to_test_result Multi Plug End Relationship;
    Relationship Class to_person Multi Plug End Relationship;
    Relationship Class work_to_person Multi Socket End Relationship;

    Object Class bug_report
      Relationships
        in_module   : module_bug Plug;
        symptom     : to_test_result Plug;
        fixed_by    : bug_fix Socket;
        reported_by : to_person Plug;
        assigned_to : work_to_person Socket;
      Attributes
        report_words : text_string;
        is_fixed     : Boolean;
      Rules
        is_fixed := fixed_by.is_fixed;
    End Object;

    Object Class fix_report
      Relationships
        fixes    : bug_fix Plug;
        fixed_by : to_person Plug;
        change   : to_delta Plug;
      Attributes
        report_words : text_string;
      Rules
        fixes.is_fixed := True;
    End Object;

Fig. 2. Objects for bug/fix tracking system.

Even though the definitions given in Fig. 2 are oversimplified, they
can be used to illustrate several important capabilities of the Cactis
system. First, we can see from this example that diverse but
interrelated data from different aspects of the development process
can be unified in a single data model. In our example, three different
kinds of data (management data involving project personnel, data
representing actual program code, and data representing test results)
have been integrated in order to implement a fourth aspect of the
environment, a bug/fix tracking system.

Second, we note that the Cactis system is able to deal with large or
unstructured types maintained by external tools. For example, the
report_words attribute of both bug_report and fix_report objects is
declared to be of type text_string. Because most of the details of
type implementation are hidden from the system, this type could be
implemented as an internal identifier which can be used by an external
file-based string package to retrieve a large string from a special
string file. Only the evaluation rules that deal with this type need
be aware of its actual implementation. This ability to integrate
private data managed by other tools is very important to the
flexibility of a software environment. In particular, this capability
is essential for dealing with object files or load modules. This data
typically must be maintained and controlled by host operating system
tools such as linkers and loaders.

Finally, we can see how derived data is used, and how the
extensibility and subtyping capabilities of the system enhance its
utility. The example uses derived data in a very limited way by
computing the is_fixed attribute of objects of type bug_report. In
this case, a fix_report object transmits this value across a bug_fix
relationship to a bug_report object. When no such relationship exists,
a value of false is supplied by default.
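
The flow of the is_fixed value across the bug_fix relationship can be mimicked in Python as follows. This is only an analogy to the declarations of Fig. 2, not Cactis syntax or its implementation: the socket is modeled as an optional reference, and the method name transmitted_is_fixed is a hypothetical stand-in for the rule attached to the plug end.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FixReport:
    report_words: str

    def transmitted_is_fixed(self) -> bool:
        # Mirrors the rule "fixes.is_fixed := True" on the plug end.
        return True


@dataclass
class BugReport:
    report_words: str
    fixed_by: Optional[FixReport] = None  # socket end of a bug_fix relationship

    @property
    def is_fixed(self) -> bool:
        # Mirrors "is_fixed := fixed_by.is_fixed"; an unconnected socket
        # yields the declared default of False.
        if self.fixed_by is None:
            return False
        return self.fixed_by.transmitted_is_fixed()
```

A BugReport with no attached FixReport reports is_fixed as False; connecting a FixReport changes the derived value to True without any explicit assignment to is_fixed.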

A more interesting, but still simple, example of derived data is given
in Fig. 3. Here we have extended our example to incorporate scheduling
information. A system of objects is used to schedule work to be done
on the basis of milestones. A milestone object is given a target
completion date and derives an expected completion date from the
expected completion date of other milestones it depends on, along with
an estimate of time required for work on the modules local to the
milestone. As a consequence, a milestone object may automatically
derive an attribute which indicates how late it is expected to be.
This value can then be used to derive a priority to be placed on work
for each module (or optionally this could be done manually with a
scheduling tool). This priority can then be transmitted along the
to_pri_module relationship defined in Fig. 3 to a pri_bug_report
object.

Note that the pri_bug_report class is a subtype of bug_report. This
indicates that it inherits all the relationships and attributes of
bug_report. In addition, we have added a new relationship and two new
attributes. The severity attribute now allows the bug to be weighted
according to how severe it is. The bug_priority attribute computes a
priority for the bug on the basis of the priority computed for (or
assigned to) the relevant module along with the severity of the bug.
Note that this computation can be done in a lazy fashion. If the bug
has already been fixed, a priority of 0 is automatically defined, and
no values needed to compute a module priority are requested or
recomputed. This is an example of where a computation involving a long
chain of dependencies can be handled efficiently and automatically
when needed, but is avoided when it is not needed. As an additional
feature, the value assigned to the bug_priority attribute could also
be used to define inclusion in a subtype. For example, one could
define the subtype hot_bug as being all objects of the pri_bug_report
class which have a bug_priority greater than some threshold. The use
of this kind of derived subtype could make it easier to retrieve
specific information regarding trouble spots in an ongoing project.
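
The lazy evaluation of bug_priority and the derived hot_bug subtype can be sketched in Python as below. The classes are hypothetical stand-ins for the objects of Fig. 3, and since the text does not define assign_priority, the product used here is purely an assumption for illustration. The request counter exists only to show that a fixed bug never asks for its module's priority.

```python
class Module:
    """Stand-in for the object supplying module_priority (hypothetical)."""

    def __init__(self, priority):
        self._priority = priority
        self.priority_requests = 0  # counts how often the value is demanded

    def module_priority(self):
        self.priority_requests += 1
        return self._priority


def assign_priority(module_priority, severity):
    # Assumed combination rule; the source leaves this function unspecified.
    return module_priority * severity


class PriBugReport:
    def __init__(self, module, severity, is_fixed=False):
        self.module = module
        self.severity = severity
        self.is_fixed = is_fixed

    @property
    def bug_priority(self):
        # Mirrors "If is_fixed Then 0 Else assign_priority(...)": the module
        # priority is only requested when the bug is still open.
        if self.is_fixed:
            return 0
        return assign_priority(self.module.module_priority(), self.severity)


def hot_bugs(reports, threshold):
    """Derived subtype: reports whose bug_priority exceeds the threshold."""
    return [r for r in reports if r.bug_priority > threshold]
```

Asking a fixed report for its bug_priority returns 0 and leaves the module's request counter untouched, which corresponds to the avoided chain of dependencies described above.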

An important feature of the Cactis system is extensibility. The
addition of new functionality to an object does not require that
existing tools be modified. For example, any tool which uses a
bug_report object can also transparently use a pri_bug_report object.
The object-oriented nature of the Cactis data model allows the
implementation details of an object to be hidden, and allows a
compatible external interface to be retained when classes are
extended. Further, since relationships are defined on the basis of the
values transmitted into and out of objects rather than on the basis of
object classes themselves, a single object can be replaced by a group
of related objects. One must only provide a compatible set of plugs
and sockets.

    Object Class milestone
      Relationships
        depends_on : milestone_dep Plug;
      Attributes
        target_compl, local_work, exp_compl, lateness : time;
      Rules
        exp_compl := ...
        lateness := exp_compl - target_compl;
    End Object;

    Relationship Class to_pri_module
      Transmits
        module_priority : sched_priority To Plug, Default = 0;
      Multi Plug
    End Relationship;

    Object Class pri_bug_report
      Subtype of bug_report;
      Relationships
        in_pri_mod : to_pri_module Plug;
      Attributes
        bug_priority : sched_priority;
        severity     : bug_severity;
      Rules
        bug_priority := If is_fixed Then 0
                        Else assign_priority(in_pri_mod.module_priority, severity);
    End Object;

Fig. 3. Extended bug/fix tracking objects.

B.  Programming  in  the  Small 

In the previous section, examples of manipulating derived data at the
level of modules and at the level of a schedule for a whole project
were considered. One of the important features of the Cactis data
model is that it can also deal with very fine-grained structures using
the very same mechanism.

To illustrate this, Figs. 4, 5, and 6 define a series of objects for
representing programs as abstract syntax trees. These sorts of
structures are typical of the tree structures used as intermediate
code for a compiler [48] or for representing programs in a
syntax-directed editor [46]. In this case we have modeled a simple
language containing sequences of assignments, while loops, and
if-then-else statements. These are modeled by the object classes
assign_stmt, while_stmt, if_stmt, sequence, and empty_stmt. In
addition, a number of object classes for representing expressions and
other constructs would be required but are not shown. These objects
can be combined to form trees using the stmt_dataflow and
expr_dataflow relationships shown in Fig. 4.

Going beyond objects which simply represent programs, we have also
included attributes and evaluation rules which can derive important
information about the program automatically (these object definitions
are adapted from an attribute grammar given in [14]). In this case we
compute dataflow information which indicates the liveness of variables
in the program. A variable is said to be live at a given program point
if its value could be used at some point later in the program for some
potential execution of the program. Variables whose value at a given
point in a program cannot be used later in the program are said to be
dead. This kind of information can be used to detect potential errors
in a program or procedure [3], [16], [38]. For example, if a local
variable of a procedure is live at the start of the procedure, then
there is a potential execution path along which the variable may be
used before it is assigned a value. This indicates a potential anomaly
in the code. This anomaly information can then be used by other parts
of the system to derive other information. However, the details of how
this information is used are hidden. Consequently, it is possible to
add new objects or new tools which use this information without
reimplementing existing objects.

    Relationship Class stmt_dataflow
      Transmits
        liveout : varset To Plug;    /* vars live on exit from stmt */
        livein  : varset To Socket;  /* vars live on entry to stmt */
        use     : varset To Socket;  /* vars used before set in stmt */
        thru    : varset To Socket;  /* vars not set on some path through stmt */
    End Relationship;

    Relationship Class expr_dataflow
      Transmits
        use : varset To Socket;      /* vars used in expr */
    End Relationship;

    Object Class df_obj  /* prototype for object using dataflow */
      Relationships
        parent : stmt_dataflow Plug;
      Attributes
        LIVEIN  : varset;  /* vars live on entry to stmt */
        LIVEOUT : varset;  /* vars live on exit from stmt */
      Rules
        LIVEOUT := parent.liveout;
        parent.livein := LIVEIN;
    End Object;

Fig. 4. Declarations for liveness analysis.

    Object Class assign_stmt
      /* <stmt> ::= ID <expr> */
      Subtype of df_obj;
      Relationships
        asn_expr : expr_dataflow Socket;  /* expression assigned */
      Attributes
        id : varid;  /* var assigned to */
      Rules
        LIVEIN := (LIVEOUT - {id}) ∪ asn_expr.use;
        parent.use := asn_expr.use;
        parent.thru := all_vars - {id};
    End Object;

    Object Class if_stmt
      /* <stmt> ::= IF <expr> THEN <stmt_list> ELSE <stmt_list> END */
      Subtype of df_obj;
      Relationships
        cond_expr : expr_dataflow Socket;  /* conditional expr */
        stmt1     : stmt_dataflow Socket;  /* then clause */
        stmt2     : stmt_dataflow Socket;  /* else clause */
      Rules
        LIVEIN := cond_expr.use ∪ stmt1.livein ∪ stmt2.livein;
        parent.use := cond_expr.use ∪ stmt1.use ∪ stmt2.use;
        parent.thru := stmt1.thru ∪ stmt2.thru;
        stmt1.liveout := LIVEOUT;
        stmt2.liveout := LIVEOUT;
    End Object;

    Object Class while_stmt
      /* <stmt> ::= WHILE <expr> DO <stmt_list> END */
      Subtype of df_obj;
      Relationships
        cond_expr : expr_dataflow Socket;  /* conditional expr */
        stmt1     : stmt_dataflow Socket;  /* loop body */
      Rules
        LIVEIN := cond_expr.use ∪ stmt1.livein ∪ LIVEOUT;
        parent.use := cond_expr.use ∪ stmt1.use;
        parent.thru := all_vars;
        stmt1.liveout := cond_expr.use ∪ stmt1.use ∪ LIVEOUT;
    End Object;

Fig. 5. Classes for computing liveness (part 1).

    Object Class sequence
      /* <stmt_list> ::= <stmt> ; <stmt_list> */
      Subtype of df_obj;
      Relationships
        stmt1 : stmt_dataflow Socket;  /* first of sequence */
        stmt2 : stmt_dataflow Socket;  /* rest of sequence */
      Rules
        LIVEIN := stmt1.livein;
        parent.use := stmt1.use ∪ (stmt2.use ∩ stmt1.thru);
        parent.thru := stmt1.thru ∩ stmt2.thru;
        stmt1.liveout := stmt2.livein;
        stmt2.liveout := LIVEOUT;
    End Object;

    Object Class empty_stmt
      /* <stmt_list> ::= EMPTY */
      Subtype of df_obj;
      Rules
        LIVEIN := LIVEOUT;
        parent.use := empty_set;
        parent.thru := all_vars;
    End Object;

Fig. 6. Classes for computing liveness (part 2).

In order to compute liveness information, we introduce two attributes,
LIVEIN and LIVEOUT, which represent the set of variables which are
live on entry to and exit from a statement, respectively. These
attributes are defined as a part of the df_obj class given in Fig. 4.
In addition, we also compute but do not store the values of two other
sets: use, which indicates the variables used before they are
reassigned in a statement, and thru, which indicates the set of
variables which are not assigned along some potential path through the
statement. These sets are transmitted between objects using the
stmt_dataflow relationship for statements and the expr_dataflow
relationship for expressions (here we assume expressions have no side
effects).

The liveness computation is an example of a backward dataflow problem.
That is, it proceeds in the direction opposite to control flow. In
this case we use an initial value of LIVEOUT at the end of a procedure
to calculate the value of LIVEIN at the beginning of the procedure
along with LIVEOUT and LIVEIN at each point in between. For example,
the variables live before an assignment statement include all the
variables used in the expression being assigned, along with all the
variables live after the assignment except the variable assigned to.
Translating this into set notation we obtain the assign_stmt
evaluation rule given in Fig. 5:

    LIVEIN := (LIVEOUT - {id}) ∪ asn_expr.use;

Similarly, the values live before a while loop include those used in
the loop test expression and those live at the start of the body of
the loop, as well as all of those live after the loop (since the loop
may execute zero times). In set notation:

    LIVEIN := cond_expr.use ∪ stmt1.livein ∪ LIVEOUT;
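
The evaluation rules of Figs. 5 and 6 can be exercised with a small Python sketch. It is an illustration only: each statement class computes use, thru, and LIVEIN by direct recursion rather than through the Cactis relationship mechanism, expressions are reduced to the set of variables they use, ALL_VARS is an assumed three-variable universe, and the intersection in the thru rule for sequences follows from the definition of thru rather than from a legible rule in the figure.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet

ALL_VARS: FrozenSet[str] = frozenset({"x", "y", "z"})  # assumed variable universe


@dataclass
class Assign:
    var: str
    expr_use: FrozenSet[str]  # variables read by the assigned expression

    def use(self):
        return self.expr_use

    def thru(self):
        return ALL_VARS - {self.var}

    def livein(self, liveout):
        return (liveout - {self.var}) | self.expr_use


class Empty:
    def use(self):
        return frozenset()

    def thru(self):
        return ALL_VARS

    def livein(self, liveout):
        return liveout


@dataclass
class Seq:
    first: Any
    rest: Any

    def use(self):
        return self.first.use() | (self.rest.use() & self.first.thru())

    def thru(self):
        return self.first.thru() & self.rest.thru()

    def livein(self, liveout):
        # stmt1.liveout := stmt2.livein; LIVEIN := stmt1.livein
        return self.first.livein(self.rest.livein(liveout))


@dataclass
class If:
    cond_use: FrozenSet[str]
    then: Any
    orelse: Any

    def use(self):
        return self.cond_use | self.then.use() | self.orelse.use()

    def thru(self):
        return self.then.thru() | self.orelse.thru()

    def livein(self, liveout):
        return self.cond_use | self.then.livein(liveout) | self.orelse.livein(liveout)


@dataclass
class While:
    cond_use: FrozenSet[str]
    body: Any

    def use(self):
        return self.cond_use | self.body.use()

    def thru(self):
        return ALL_VARS

    def livein(self, liveout):
        # stmt1.liveout := cond_expr.use U stmt1.use U LIVEOUT
        body_out = self.cond_use | self.body.use() | liveout
        return self.cond_use | self.body.livein(body_out) | liveout


# x := y; WHILE z DO y := x END
program = Seq(Assign("x", frozenset({"y"})),
              Seq(While(frozenset({"z"}), Assign("y", frozenset({"x"}))), Empty()))
live_at_start = program.livein(frozenset())
```

For this program live_at_start contains y and z: both may be read before the program assigns them, which is the kind of potential anomaly the text describes.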

In the examples, we have used set notation for clarity. In the real
specification used by the Cactis system this would be replaced by
function calls implementing set operations. In general, the rules
given compute the thru and use sets on the basis of their children and
local information, then, given a value for LIVEOUT from their parent,
are able to compute the value of LIVEIN. The exact ordering of these
computations is only partially defined. Within this partial order,
computations are conceptually performed concurrently. At present, the
computations are performed in an order expected to minimize disk
accesses. The system is also being extended so that these computations
can actually be performed in parallel, as discussed in Section IV.

The computations we have defined in Figs. 4, 5, and 6 do not involve
cyclic dependencies. However, most dataflow problems are more easily
characterized as the solution to a fixed point problem and hence
involve cyclic attribute dependencies. The solution of such fixed
point problems using attribute grammars is considered in [14]. In this
work, an appropriate class of circular, but well-defined attribute
grammars is defined, and simple conditions are stated for guaranteeing
termination of the resulting cyclic attribute computations. Roughly
speaking, these conditions require that the evaluation functions
involved be monotonic, and that the attributes involved come from
finite domains. Under these conditions, a least fixed point solution
can always be found by successive approximation (i.e., iteration until
convergence). These conditions extend straightforwardly to the
attributed graphs used by the Cactis system.
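
Successive approximation for a small cyclic system can be sketched in Python as follows. This is a generic fixed point iteration, not the evaluator of [14] or of Cactis: the function least_fixed_point and the two example rules are hypothetical, every rule is assumed to be monotonic over finite sets, and all attributes start from the empty set.

```python
def least_fixed_point(rules, bottom=frozenset()):
    """Iterate set-valued rules until no attribute changes.

    rules: dict mapping an attribute name to a function that takes the
           current dict of values and returns a new frozenset.
    """
    values = {name: bottom for name in rules}
    changed = True
    while changed:
        changed = False
        for name, rule in rules.items():
            new_value = rule(values)
            if new_value != values[name]:
                values[name] = new_value
                changed = True
    return values


# Cyclic liveness equations for the body of "WHILE z DO y := x END",
# with nothing live after the loop:
#   body_in  = {x} U (body_out - {y})
#   body_out = {z} U body_in
rules = {
    "body_in": lambda v: frozenset({"x"}) | (v["body_out"] - {"y"}),
    "body_out": lambda v: frozenset({"z"}) | v["body_in"],
}
solution = least_fixed_point(rules)
```

Here the two attributes depend on each other, and the iteration settles with both equal to the set containing x and z. Termination relies on the monotonicity and finite-domain conditions stated above.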

The second phase of the Cactis incremental update algorithm can be
modified slightly to handle cyclic but well-defined attribute systems.
While a value is being reevaluated in the evaluation phase, it is
given a special in-progress mark. If such a mark is encountered during
evaluation, a cycle exists. Such marks are used to identify the
strongly connected components of the dependency graph. These can then
be used to effectively compute an iterative solution to the fixed
point problem. A related algorithm for incremental evaluation of fixed
points in conventional attribute grammars is also studied in [25].
However, this algorithm is based on the optimal update algorithm for
trees and does not extend to attributed graphs.
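
The use of in-progress marks to detect cycles can be illustrated with a short Python sketch. It covers only the detection step: the function find_cycle_entries is hypothetical, it reports the attributes at which a demand-driven traversal meets its own in-progress mark, and it does not go on to build the strongly connected components or the iterative solution.

```python
def find_cycle_entries(deps):
    """Return attributes reached again while still marked in-progress.

    deps: dict mapping each attribute to the attributes its rule reads.
    """
    IN_PROGRESS, DONE = "in-progress", "done"
    state = {}
    entries = set()

    def visit(attr):
        if state.get(attr) == DONE:
            return
        if state.get(attr) == IN_PROGRESS:
            entries.add(attr)  # mark met during its own evaluation: a cycle
            return
        state[attr] = IN_PROGRESS
        for source in deps.get(attr, ()):
            visit(source)
        state[attr] = DONE

    for attr in deps:
        visit(attr)
    return entries
```

For dependencies in which a reads b, b reads c, and c reads b, the traversal from a meets the in-progress mark on b, so b is reported as lying on a cycle.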

It  should  be  noted  that  the  Cactis  system  may  not  pe 
form  well  enough  in  some  situations  to  support  ail  aspec 
of  programming  in  the  small  in  a  practical  manner.  In  pa 
ticular.  it  is  currently  unknown  if  interactive  edit-time  si. 
mantic  tests  would  really  be  practical.  If  related  progra 
components  are  spread  across  many  disk  blocks  and  : 
termixed  with  other  objects,  the  time  to  simply  fetch  . 
the  required  disk  blocks  would  be  too  long  in  itself.  Hov 
ever,  if  the  objects  in  question  were  placed  together  i 
disk  blocks,  and  their  totai  size  was  close  to  that  wha 
would  fit  in  memory,  adequate  performance  could  pro. 
ably  be  achieved.  We  are  currently  exploring  mechanist 
tor  clustering  of  data  which  should  allow  good  perti 
mance  for  programming  in  the  small  problems.  Howevc 
only  further  experiments  with  the  system  will  indica 
whether  the  system  will  actually  be  fast  enough  to  suppn 
this  kind  of  problem. 
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C.  Version  Retrieval 

As  we  have  seen  in  the  two  previous  subsections,  the 
mechanisms  of  Cactis  may  be  used  to  manipulate  derived 
data  at  both  the  levei  of  single  programs  and  at  the  level 
statements  or  expression  within  a  program.  In  a  software 
environment,  it  is  also  necessary  to  manage  groups  of 
programs  and  metaobjects  which  are  responsible  for  man¬ 
aging  and  organizing  other  objects.  There  are  several  con¬ 
texts  in  which  this  is  done.  For  example,  an  engineer  may 
need  to  differentiate  different  versions  of  a  related  set  of 
modules.  The  Cactis  data  model  allows  objects  to  be  cre¬ 
ated  which  group  together  such  a  set  of  modules  and  form 
the  basis  for  a  version  control  tool  which  would  allow 
groups  of  modules  to  be  checked  out.  modified  and 
checked  in  as  a  new  version.  In  such  a  system,  different 
versions  of  a  program  are  normally  not  explicitly  stored, 
but  rather  derived  from  a  current  version  through  some 
delta  mechanism.  In  the  Cactis  system  the  delta  infor¬ 
mation  needed  to  recover  old  versions  can  be  compact, 
and  can  itself  be  modeled  as  a  set  of  objects.  Because  of 
the nature of the data model used, the delta mechanism
can  also  be  efficient  and  straightforward  to  implement. 

Because a single change to a Cactis data object can cause derivations of new data
arbitrarily far from the point of direct change, it might seem that a delta mechanism
would be difficult to implement. This is not the case. To understand why a
Cactis-supported delta mechanism can in fact be very simple, we can make a simple
observation. Because all attribute evaluation rules are functional in nature, if a
user modifies an object by changing an attribute value, the entire system can be
restored to its original state simply by restoring the old value of the attribute.
The same mechanism which automatically derives new data based on changes can also be
used to automatically underive those changes. This observation extends to structural
changes as well as to multiple changes made together. Even though a single change may
have wide-ranging effects in a database, only the data directly changed by the user
needs to be stored in order to reverse those effects. This allows a very
straightforward undo mechanism that can be used to reverse changes within a session,
regardless of how the changes were made. Consequently, one need not build an undo
mechanism into each tool, but can use the general mechanism provided by the system.
In addition, this same capability can also provide an efficient delta mechanism by
keeping objects representing edit operations and old values.
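
The observation about undo can be illustrated with a small sketch. It is not taken from the Cactis implementation: the `Database` class, its `get`, `set`, and `undo` methods, and the attribute names are hypothetical, and derived values are simply recomputed on every read rather than updated incrementally. The point is only that, with side-effect-free rules, the undo log needs the old values of directly changed attributes and nothing else.

```python
class Database:
    """Toy store: intrinsic values plus side-effect-free rules for derived ones."""

    def __init__(self):
        self.intrinsic = {}   # attribute name -> stored value
        self.rules = {}       # attribute name -> function(db) computing the value
        self.undo_log = []    # (name, old value) for direct changes only

    def get(self, name):
        # Derived attributes are recomputed from their rule on every read;
        # a real system would cache them and update incrementally.
        if name in self.rules:
            return self.rules[name](self)
        return self.intrinsic[name]

    def set(self, name, value):
        # Assumes the attribute already exists. Only the directly changed
        # value is logged, never its derived consequences.
        self.undo_log.append((name, self.intrinsic[name]))
        self.intrinsic[name] = value

    def undo(self):
        name, old = self.undo_log.pop()
        self.intrinsic[name] = old


db = Database()
db.intrinsic.update({"parts": 40, "labor": 10})
db.rules["cost"] = lambda d: d.get("parts") + d.get("labor")
db.rules["price"] = lambda d: 2 * d.get("cost")

before = db.get("price")     # derived from parts and labor
db.set("parts", 70)          # one direct change, two derived values affected
after = db.get("price")
db.undo()                    # restoring one stored value restores the rest
restored = db.get("price")
```

Keeping the `(name, old value)` pairs instead of discarding them after a session would give the kind of delta record the text describes.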

IV.  Directions 

Cactis  is  an  operational,  multiuser  DBMS.  It  consists 
of about 70,000 lines of C code, and runs in the Berkeley
UNIX  environment.  The  system  currently  only  provides 
centralized  storage,  but  supports  concurrent  access  by 
multiple  users  via  a  timestamping  concurrency  control 
technique. 

The system is currently being extended from the centralized implementation to a
distributed implementation suitable for use in a distributed workstation environment.
Properties of the Cactis data model make this transition particularly easy. Recall
that the system only defines a partial order on computations and already uses an
unpredictable evaluation order. To distribute the system, it is easy to
(conceptually) insert a pair of special communication objects between any two related
objects as shown in Fig. 7. Because the interface between two related objects
consists solely of the type and number of values transmitted across the relationship,
the actual transmission medium, whether local to a single machine or across a
network, is transparent. The only modification to the sequential evaluation algorithm
that is needed is synchronization between the first and second phases. The entire
first phase must be complete before any computations in the second phase may begin.
The existing concurrency control system will work in both the centralized and
distributed systems.

Fig. 7. Placing data on different hosts.
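
As a rough illustration of why the transmission medium can be transparent, the sketch below inserts a pair of communication objects between two related objects. Everything here is assumed for the example: the class names `SendStub` and `ReceiveStub`, the `export` method, and the use of JSON as the wire format. No real network is involved; the byte string simply stands in for one.

```python
import json


class Source:
    """An object that exports one value across a relationship."""
    def __init__(self, value):
        self.value = value

    def export(self):
        return self.value


class SendStub:
    """Lives on the exporting host; turns the exported value into bytes."""
    def __init__(self, local_object):
        self.local_object = local_object

    def transmit(self):
        return json.dumps(self.local_object.export()).encode("utf-8")


class ReceiveStub:
    """Lives on the importing host; looks like an ordinary related object."""
    def __init__(self, send_stub):
        self.send_stub = send_stub    # stands in for a network connection

    def export(self):
        return json.loads(self.send_stub.transmit().decode("utf-8"))


class Consumer:
    """Derives a value from whatever arrives across the relationship."""
    def __init__(self, related):
        self.related = related

    def derived(self):
        return self.related.export() + 1


local = Consumer(Source(41)).derived()
remote = Consumer(ReceiveStub(SendStub(Source(41)))).derived()
```

`Consumer` is unchanged in the two cases, since it depends only on the type and number of values it receives.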

A  further  advantage  of  the  Cactis  data  model  is  that  it 
will  be  able  to  automatically  manage  replication  of  data 
in  the  distributed  environment.  With  some  small  changes 
to  the  incremental  update  algorithm,  replicated  copies  of 
data  can  be  treated  as  a  form  of  derived  data.  Since  the 
incremental  update  algorithm  is  lazy,  this  system  will 
amount  to  lazy  replacement  of  replicated  copies,  with  all 
bookkeeping and coordination handled automatically by
the  evaluation  system. 
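
The idea of treating a replica as derived data can be sketched as follows. This is an illustrative assumption, not the planned Cactis design: the `Primary` and `Replica` classes and the `stale` flag are hypothetical names. A write only marks copies as out of date, and a copy is refreshed the first time it is read afterwards.

```python
class Primary:
    def __init__(self, value):
        self.value = value
        self.replicas = []

    def write(self, value):
        self.value = value
        # Only an out-of-date mark reaches the replicas; no data is copied yet.
        for replica in self.replicas:
            replica.stale = True


class Replica:
    """A copy treated as a derived value whose rule is 'equal to the primary'."""

    def __init__(self, primary):
        self.primary = primary
        self.copy = primary.value
        self.stale = False
        self.fetches = 0
        primary.replicas.append(self)

    def read(self):
        if self.stale:                # refresh lazily, on first use after a change
            self.copy = self.primary.value
            self.stale = False
            self.fetches += 1
        return self.copy


p = Primary("v1")
r = Replica(p)
p.write("v2")
p.write("v3")          # two writes, but the replica has fetched nothing so far
value = r.read()       # a single fetch brings the copy up to date
```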

Finally,  due  to  the  large,  complex  nature  of  derived  data 
found  in  software  environments,  a  software  database  must 
support unusual forms of transaction specification. Another
direction of current research is support for these unusual
requirements, particularly support for long and
nested transactions. Typically, a designer will check out a
group  of  modules,  work  on  them,  and  check  them  back 
in.  This  may  entail  a  long  interactive  database  transac¬ 
tion,  during  which  the  designer  works  on  local  versions 
of  the  modules.  They  would  then  be  checked  back  in  as 
new  versions,  and  modules  and  object  code  may  be  formed 
which  use  them.  Clearly,  database  transactions  in  a  soft¬ 
ware  environment  are  typically  longer  than  standard, 
business-oriented transactions. They are also more complex,
tending to spawn subtransactions which perform
subtasks, such as compiling a program or relinking a load
module. Software environment databases therefore require
very dependable rollback mechanisms. Again, the
simple mechanisms described in the last section for undo
can be used to perform this rollback.

Long transactions may also necessitate more powerful
and  flexible  constraint  mechanisms  than  are  typically 
found  in  conventional  database  systems.  Constraints  can 
themselves  be  described  as  a  form  of  derived  data.  In  a 
software  application,  designers  may  want  the  capability 
to  specify  complex  constraints.  An  example  would  be  that 
the  versions  which  make  up  a  load  module  be  compatible 
in  terms  of  the  parameters  they  pass.  Such  constraints  can 
easily be computed in the Cactis system as Boolean-valued
attributes. This form of attribute would use a predicate
representing the constraint as its evaluation rule. The
actual enforcement of constraints in the system can be
flexible. Constraints may be tied to the termination of a
transaction or the check-in point of a group of modules. In
order  to  give  a  software  designer  tight  control  over  the 
accuracy  of  a  long,  interactive  transaction,  constraints 
may  also  be  enforced  at  the  user's  request  or  checked  con¬ 
tinuously. 
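
A constraint expressed as a Boolean-valued derived attribute might look like the sketch below. The representation of modules (a name, an arity, and a list of calls) and the names `params_compatible`, `LoadModule`, and `check_in` are assumptions made for the example. The same predicate is evaluated on request during editing and enforced at check-in.

```python
def params_compatible(modules):
    """Predicate: every call passes as many arguments as the callee declares."""
    declared = {m["name"]: m["arity"] for m in modules}
    return all(
        declared.get(callee) == nargs
        for m in modules
        for callee, nargs in m["calls"]
    )


class LoadModule:
    def __init__(self, modules):
        self.modules = modules

    @property
    def compatible(self):
        # Boolean-valued derived attribute; its evaluation rule is the predicate.
        return params_compatible(self.modules)

    def check_in(self):
        # One possible enforcement point: refuse check-in if the constraint fails.
        if not self.compatible:
            raise ValueError("constraint violated: incompatible parameters")
        return "checked in"


lm = LoadModule([
    {"name": "main", "arity": 0, "calls": [("parse", 2)]},
    {"name": "parse", "arity": 1, "calls": []},
])
ok_during_edit = lm.compatible      # checked on request; editing may continue
lm.modules[1]["arity"] = 2          # the designer fixes the mismatch
result = lm.check_in()
```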

There are a number of features of Cactis which are not yet complete. At this point,
nested transactions are not supported, and cyclic attribute computations are not
handled. Also, an effective software environment must support an interactive merge
facility, whereby two versions of a module which were derived in parallel may be
merged into one consistent module. Database support is clearly needed for this
facility. Also, Cactis does not yet support incremental, run-time schema changes.
This is likely to be a very serious shortcoming. An interesting question is whether,
as many have suggested, the traditional distinction between database administrators
and end-users is not valid in a software environment database. In business
applications, database administrators make schema changes in the process of designing
and building the database using the DBMS, while end-users never make schema changes.
However, in a software environment, it may be true that end-users make significant
schema changes at run-time, by performing such operations as introducing new tools to
a system under construction or by asking for new forms of consistency checks and
constraint tests.

Currently, Cactis is being distributed to a small number of users. In order to evolve
and fine-tune the implementation of Cactis, plans call for instrumenting the system
to gather statistics concerning real-life software environment applications.
Obviously, the general properties of software databases will vary from application to
application, but at this point in time, very little is known about the
characteristics of actual data. Thus, even very rudimentary information will be
useful.

The sorts of information that we will collect include the following: the size of
software databases, the ratio of schema to data size, and the number of objects in
typical types. This will give us a feeling for the storage requirements of software
environments. An interesting question is whether or not typical transactions will
retrieve most of the data they need early (in the checkout phase), use it, then check
it back in. This would reduce the problem of concurrency control. It would also
indicate that techniques developed for main memory databases [34] may prove useful.
Information about the depth and breadth of dependency graphs among derived
attributes, the patterns of repetition among database sessions, and the percentage of
set-oriented versus non-set-oriented database operations will help us determine the
effectiveness of our clustering and scheduling algorithms.

Abstract

Cactis  is  an  object-oriented,  multi-user  DBMS  developed  at  the  University  of 
Colorado.  The  system  supports  functionally-defined  data  and  uses  techniques  based  on 
attributed  graphs  to  optimize  the  maintenance  of  functionally-defined  data. 

The implementation is self-adaptive in that the physical organization and the update
algorithms dynamically change in order to reduce disk access. The system is also
concurrent. At any given time there are some number of computations that must be
performed to bring the database up to date; these computations are scheduled
independently and performed when the expected cost to do so is minimal. The DBMS runs
in the Unix/C Sun workstation environment.

Cactis  is  designed  to  support  applications  which  require  rich  data  modeling  capabili¬ 
ties  and  the  ability  to  specify  functionally-defined  data,  but  which  also  demand  good  per¬ 
formance.  Specifically,  Cactis  is  intended  for  use  in  the  support  of  such  applications  as 
VLSI  and  PCB  design,  and  software  environments. 


1.  Introduction 


Cactis is an object-oriented DBMS which supports a wide class of derived information
- data which is computed from functions. In Cactis, a database object can have both
attributes and integrity constraints which are functionally-defined. Thus, the system
is useful for automatically maintaining data which would normally have to be derived
by an application program. As an example, a business database might keep track of
objects called widgets. Each of these objects might have an attribute which is a
pricing function that derives the selling cost of a widget in terms of the cost of
its parts and the labor to put it together. There might be a constraint that says one
sort of widget can never cost more than twice another sort of widget. In each of
these cases, the Cactis system would automatically recompute derived values whenever
necessary, and enforce constraints whenever updates are made.
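
The widget example can be made concrete with a short sketch. It is written in Python rather than the Cactis data language, and the names `Widget`, `price_constraint`, and `update_labor`, as well as the pricing formula (parts plus labor), are assumptions for illustration only.

```python
class Widget:
    def __init__(self, part_costs, labor):
        self.part_costs = list(part_costs)   # stored value
        self.labor = labor                    # stored value

    @property
    def price(self):
        # Derived: selling cost in terms of parts and assembly labor.
        return sum(self.part_costs) + self.labor


def price_constraint(cheap, dear):
    """Constraint: `dear` may never cost more than twice `cheap`."""
    return dear.price <= 2 * cheap.price


def update_labor(widget, labor, cheap, dear):
    """Apply an update, rolling it back if the constraint would be violated."""
    old = widget.labor
    widget.labor = labor
    if not price_constraint(cheap, dear):
        widget.labor = old
        return False
    return True


basic = Widget([3, 4], labor=3)
deluxe = Widget([8, 7], labor=2)
accepted = update_labor(deluxe, 5, basic, deluxe)   # deluxe is exactly twice basic
rejected = update_labor(deluxe, 9, basic, deluxe)   # would exceed twice basic
```

The application code never stores a price; it changes `labor` and the derived value and the constraint follow from the rules.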

There are two issues involved in supporting functionally-defined data: the necessary
conceptual  modeling  capabilities  of  the  DBMS  and  the  physical  modeling  techniques 
required  to  implement  this  model  efficiently.  In  the  case  of  the  Cactis  system,  the  con¬ 
ceptual  and  physical  models  used  by  the  database  are  identical.  Cactis  uses  an  attributed 
graph  formalism  which  generalizes  attribute  grammars  [24, 25]  and  incremental  attribute 
evaluation  techniques  [12, 35]  used  in  compilers  and  syntax  directed  editors.  In  order  to 
make  the  system  effective  in  terms  of  minimizing  I/O,  it  has  been  constructed  using  self- 
adaptive  and  concurrent  techniques. 

The  Cactis  data  model  provides  a  useful  and  simple  formalism  for  describing 
derived information. This simple formalism has in turn led to a simple and straightforward
implementation strategy. Indeed, the authors have found that this formalism allowed
us  to  construct  a  large  system  very  quickly.  The  system  was  built  by  a  team  of  twelve 
students  in  about  a  year,  with  the  majority  of  the  code  being  written  by  three  of  them. 


The authors further discovered that the physical data model used was conducive to a
concurrent implementation. At any point in time, the database is viewed as containing
some number of pending computations that must be performed in order to keep derived
data up to date. These computations may be generated by a number of transactions
executing concurrently or, more often, may be the result of multiple subcomputations
needed for a single transaction. Cactis is also self-adaptive, in that the system
responds to usage patterns in two ways. First, the Cactis scheduler dynamically
selects the next pending computation to perform by deciding which one will provide
the best expected performance. The criterion for selecting the next computation may
be that the result is required by a user or that the data objects required to perform
the update can be obtained with little I/O.
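
One way to picture such a scheduler is the sketch below. The task records, the `expected_io` estimate (blocks needed but not yet buffered), and the rule that user-requested results come first are all assumptions made for the example; the text does not specify how the two criteria are combined.

```python
def expected_io(task, in_memory):
    """Number of disk blocks the task needs that are not already buffered."""
    return len(task["blocks"] - in_memory)


def pick_next(pending, in_memory):
    """Prefer results a user is waiting for; otherwise take the cheapest task."""
    return min(
        pending,
        key=lambda t: (not t["needed_by_user"], expected_io(t, in_memory)),
    )


pending = [
    {"name": "recompute A", "blocks": {1, 2, 3}, "needed_by_user": False},
    {"name": "recompute B", "blocks": {4, 5}, "needed_by_user": False},
    {"name": "recompute C", "blocks": {7, 8, 9}, "needed_by_user": True},
]
in_memory = {4, 5, 6}

first = pick_next(pending, in_memory)["name"]
rest = [t for t in pending if t["name"] != first]
second = pick_next(rest, in_memory)["name"]
```

Here the user-requested task is chosen first even though it needs three block reads, and the task whose blocks are already buffered comes next.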

The  second  self-adaptive  technique,  which  works  hand-in-hand  with  the  scheduler, 
is  the  clusterer.  This  subsystem  is  run  off-line  and  periodically  reblocks  the  database 
according  to  the  way  in  which  data  is  historically  accessed.  The  clusterer  places  two 
objects  near  each  other  if  one  is  commonly  used  to  derive  the  other.  Thus,  when  the 
scheduler  selects  the  next  pending  computation,  the  clusterer  will  have  already  minim¬ 
ized  the  I/O  effort  involved  in  performing  the  update. 
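
The clustering idea can be sketched with a simple greedy heuristic. This is a swapped-in illustration rather than the Cactis clusterer: the function `cluster`, the format of the derivation log, and the fixed `block_size` are hypothetical. Pairs of objects that are most often used together in derivations are placed first, and an object joins its partner's block if there is room.

```python
from collections import Counter


def cluster(derivation_log, block_size):
    """Greedily pack objects into blocks, most frequently co-used pairs first."""
    pair_counts = Counter(frozenset(p) for p in derivation_log if p[0] != p[1])
    block_of = {}   # object name -> the block (list) it was placed in
    blocks = []
    for pair, _count in pair_counts.most_common():
        a, b = sorted(pair)
        for obj, partner in ((a, b), (b, a)):
            if obj in block_of:
                continue
            # Join the partner's block if it has room, else open a new block.
            home = block_of.get(partner)
            if home is None or len(home) >= block_size:
                home = []
                blocks.append(home)
            home.append(obj)
            block_of[obj] = home
    return blocks


# Each entry records that the first object was used to derive the second.
log = [("parts", "cost")] * 5 + [("cost", "price")] * 3 + [("price", "report")]
blocks = cluster(log, block_size=2)
```

With this log, the most frequently paired objects (`parts` and `cost`) share a block, so deriving one from the other touches a single block.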

Cactis was designed with specific sorts of data in mind. For example, complex
engineering data is seen as a natural application. In particular, Cactis has been
studied as a foundation for the support of software environments. A software
environment is an application which requires the management of highly interconnected
data. Modern environments attempt to provide a facility for managing the design,
construction, testing, use, and eventual reuse of software. One of the most important
requirements of a software environment is providing a central store for managing the
myriad of objects which make up a software project and keeping these objects up to
date in the face of the many changes made over the lifetime of a project.

A  number  of  features  of  the  Cactis  system  make  it  conducive  to  such  applications. 
First, the system supports the construction of complex data and type/subtype hierarchies.
This  is  necessary  in  order  to  cleanly  model  such  things  as  programs,  requirement  and 
design  specifications,  progress  and  bug  reports,  configurations,  and  documentation  which 
are  representative  of  the  data  found  in  existing  and  proposed  environments,  as  described 
in [11, 33]. Next, functionally-defined attributes are also very useful in a software
environment  application,  as  such  a  system  might  contain  large  amounts  of  derived  data  in 
the  form  of  compilations,  cost  calculations,  and  scheduling  dependencies.  Cactis  pro¬ 
vides  a  mechanism  for  constructing  derived  data  which,  although  it  supports  a  smaller 
class  of  derived  information  than  generalized  triggers  [6],  is  much  more  efficient  than 
triggers.  The  system  also  allows  the  user  to  extend  the  type  structure,  which  is  useful  for 
adding  new  tools  to  such  a  system  without  disturbing  existing  functionality. 

The  capability  to  support  complex  data  and  type/subtype  hierarchies  discussed 
above is provided by a subsystem of Cactis called Sembase, a tool constructed at the
University  of  Colorado  (see  [15, 22]).  The  other  three  capabilities  form  a  major  recent 
development  effort.  A  brief  preliminary  report  describing  the  design  of  Cactis  appears  in 
[18].  The  system  is  now  complete  and  consists  of  approximately  65,000  lines  of  C  code, 
and  uses  a  timestamping  concurrency  control  technique. 


2.  Related  Work 


Recently,  significant  interest  has  developed  in  object-oriented  database  models,  and 
in models which represent derived information. A large class of such models is commonly
called semantic models. A complete discussion of such models and their relationship
to traditional models may be found in [20, 23]. Briefly, traditional database models
support  record-like  structures  and/or  inter-record  links  (e.g.,  the  relational,  hierarchical, 
and  network  models).  Semantic  models  support  expressive  data  relationships;  a  typical 
semantic  model  allows  a  designer  to  specify  complex  objects,  and  also  supports  at  least 
one  form  of  derived  relationship,  generalization  (sometimes  called  subtyping).  With  gen¬ 
eralization,  one  sort  of  object  can  be  defined  as  belonging  to  a  subcategory  of  a  larger 
category  of  objects.  Semantic  models  are  limited  in  the  sense  that  they  commonly  do  not 
include  support  of  methods  which  operate  on  data  objects,  as  is  typical  in  more  general¬ 
ized  object-oriented  models. 

For  a  discussion  of  a  number  of  research  efforts  directed  at  implementing  object- 
oriented  database  systems,  see  [13].  Such  systems  vary  from  extensions  to  the  relational 
model  to  handle  complex  data  [41]  to  database  implementations  based  on  the  message 
passing  paradigm  of  Smalltalk  [27, 28].  An  object-oriented  system  which  uses  persistent 
programming  techniques  is  described  in  [2].  [9, 38]  discuss  data  structures  and  access 
methods  used  to  implement  semantic  databases.  Object-oriented  implementations 
designed to support extensible databases are described in [3, 8]; these systems are toolkits
which allow the user to tailor data modeling and storage mechanisms of a database system.
Another extensible system, designed for such applications as engineering, is
described in [29]. There has also been some work in the area of database support for
software engineering; see [31, 42, 45].

A  common  theme  running  through  many  of  these  projects  is  that  an  object-oriented 
system  must  be  able  to  support  a  wide  variety  of  objects  and  allow  attributes  of  objects  to 
be  derived  in  terms  of  other  data  items  in  the  database.  Other  researchers  have  stressed 
the  importance  of  derived  data  in  knowledge  based  databases  [26, 30, 36].  Much  of  the 
previous  work  in  this  area  has  come  from  AI  research  oriented  toward  constraint  based 
programming  systems  [5]. 

During the development of the Sembase subsystem of Cactis, a couple of lessons were
learned.  While  this  project  did  produce  a  system  capable  of  supporting  a  wide  class  of 
object-oriented  systems,  including  some  forms  of  derived  information,  it  fell  short  in  two 
ways.  First  of  all,  only  a  subset  of  first  order  predicate  calculus  expressions  may  be  used 
to  manage  derived  data.  Secondly,  the  code,  while  very  efficient,  is  tricky  and  inelegant. 
Cactis  supports  a  much  wider  class  of  derived  information,  and  does  so  in  a  clean 
fashion,  based  on  a  simple  algorithmic  model. 

In the next section, the Cactis data model is briefly described. Section 4 considers
how algorithmically efficient incremental update can be performed, and the
implementation of Cactis is discussed in Section 5. Section 6 describes the Cactis
data language; the examples are taken from a software environment application.
Section 7 discusses performance tests which have been conducted. Finally, Section 8
considers some of the limitations of the system, details the directions this research
is currently taking, and provides conclusions.


3.  The  Cactis  Data  Model 


In  this  section  we  will  first  informally  introduce  and  motivate  the  Cactis  data  model, 
and  then  introduce  a  more  formal  description  along  with  an  example. 

Informally, the data in a Cactis database consists of a collection of typed objects.
Each  object  represents  some  entity  modeled  in  the  database,  and  encapsulates  both  the 
data  and  the  behavior  associated  with  that  entity.  Objects  contain  internal  (hidden)  struc¬ 
ture,  and  can  be  related  to  one  another  externally  by  relationships  to  create  external  struc¬ 
ture.  In  conventional  object-oriented  systems  such  as  Smalltalk  [16],  the  external  inter¬ 
face  to  an  object  is  a  set  of  messages  which  it  can  respond  to.  In  a  Cactis  database  the 
interface  to  an  object  is  the  set  of  values  that  flow  into  and  out  of  the  object  across  rela¬ 
tionships. 

A  Cactis  schema  defines,  for  each  object  type,  an  internal  implementation.  This 
internal  implementation  defines  the  values  stored  within  an  object,  and  the  constraints 
placed  on  these  values.  In  addition,  the  schema  specifies  how  these  values  may  be  func¬ 
tionally  derived  from  values  passed  into  the  object  across  relationships,  and  how  the 
object  may  derive  the  values  to  be  passed  out  of  the  object  across  those  relationships. 
This  functional  derivation  of  data  values  allows  objects  to  respond  to  their  environment. 
When  a  data  value  imported  into  an  object  across  a  relationship  changes  value,  the  inter¬ 
nal  implementation  of  the  object  may  respond  by  recomputing  local  data,  and  by  provid¬ 
ing  new  data  which  is  exported  from  the  object  across  relationships.  This  automatic 
derivation  of  data  implements  the  behavior  of  the  object. 

An important distinction between Cactis and other high-level models, such as the
predominant semantic models, is that it handles relationships very differently. For
example, in the Entity-Relationship Model [10], the Semantic Data Model [17], and the
Functional Data Model [21, 37], relationships are defined with types; an object type
definition encompasses the relationships it participates in, including the range
types of the given relationships. This is not true in Cactis, where relationships are
typed separately, and the range type of a relationship does not depend on the domain
type.

A further consequence is that a DBMS based on a semantic model is not conducive to
restructuring. In order to vary the manner in which two types are connected via a
relationship, one must redefine two types. In Cactis this can be done dynamically,
at run-time, by merely assigning a new relationship to both sets of connectors.

To make these concepts and their implications more concrete, we now introduce a
more  formal  definition  of  the  model  and  provide  an  example  in  a  graphical  notation. 

Each  object  in  a  Cactis  database  is  an  instance  of  a  type  from  a  hierarchical  type 
system.  Multiple  inheritance  is  supported  and  resolved  at  schema  definition  time.  The 
type  of  an  object  can  be  either  explicitly  declared,  or  chosen  dynamically  from  a  family 
of subtypes using predicates which are evaluated whenever updates are made. Each type
specifies  the  internal  implementation  of  a  class  of  objects  along  with  their  external  inter¬ 
face.  The  internal  implementation  of  an  object  indicates  a  set  of  typed  data  values  called 
attributes  that  are  stored  within  the  object.  Currently,  attributes  may  take  values  from 
any  type  definable  in  the  C  programming  language. 

In addition to attributes, an object’s internal implementation specifies how some of
these values may be functionally derived from other values, either within the same
object, or imported across relationships. This specification comes in the form of
attribute evaluation rules attached to attributes. An evaluation rule attached to an
attribute indicates that the attribute’s (observable) value should always be equal to
the expression given in the rule. Attributes which have an evaluation rule attached
are called derived attributes, whereas values which are simply stored are called
intrinsic attributes. Attribute evaluation rules are applicative and may not have
side effects. Currently, attribute evaluation rules are expressed in a data language
which is compiled into C and can compute any function expressible in the C language.
Finally, the internal implementation of an object may specify additional constraints
on the attributes of an object. These constraints provide predicates that are to hold
true after every database update. These constraints are implemented using normal
attribute evaluation rules which return a boolean value which is the result of
evaluating the constraint predicate.

Figure 1. A Simplified Sample Schema

The external interface to an object describes the relationships it may have with
other objects and the way values flow across those relationships. Each relationship
is typed and directed. Type and direction are used to represent semantic concepts.
For example, a relationship might represent the semantic concept "component-of".
Because of the need for directedness, relationships have two distinct ends which we
will call connectors. In order to distinguish the ends of a relationship and hence
establish its directedness, we say that each relationship possesses a black connector
and a white connector. The external interface for an object consists of a set of
connectors which are intended to match the connectors of a relationship type.


Figure 2. Example Objects

As an example, Figure 1 uses a simplified graphical notation (similar to [1]) to show
three object types: Proj_Cost, Comp_Cost, and Fixed_Cost, along with one relationship
type: CST. (To simplify the graphical presentation we have used only 1:1
relationships. A more realistic set of objects for this task would use 1:many
relationships.) Proj_Cost objects represent and compute the total estimated cost for
a software project as derived from other objects. A Comp_Cost object automatically
computes the estimated cost of a part of a project on the basis of a formula
encapsulated in the object, along with the estimates for subparts it is related to.
Finally, a Fixed_Cost object represents an estimate which has been explicitly entered
by a user.

Note that, in this example, objects of type Proj_Cost possess a black CST connector;
hence they may be related to any object which possesses a white CST connector, in
this  case  objects  of  type  Comp_Cost  or  Fixed_Cost.  Similarly,  objects  of  type 
Comp_Cost  also  possess  a  black  CST  connector,  therefore  they  may  be  related  to  objects 
of  type  Comp_Cost  or  Fixed_Cost.  An  important  point  here  is  that  an  object  of  type 
Proj_Cost  cannot  know,  and  does  not  care,  whether  it  is  related  to  an  object  of  type 
Comp_Cost,  or  type  Fixed_Cost,  or  of  some  type  not  shown  (perhaps  even  a  type  that  did 
not  exist  when  Proj_Cost  was  defined).  An  object  of  type  Proj_Cost  is  only  concerned 
with  the  values  that  flow  across  a  CST  relationship,  not  how  they  are  provided.  This 
allows  a  Cactis  database  to  be  both  flexible  and  extensible  while  retaining  strong  typing. 
In  this  case  both  automatically  computed  and  manually  entered  cost  estimates  can  be 
handled  uniformly,  and  a  number  of  different  cost  estimating  formulas  or  methods  could 
all  be  handled  uniformly  and  transparently. 
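
The uniform treatment of computed and manually entered estimates can be sketched as follows. The type names come from the example above, but everything else is assumed: the `cst_in` and `cst_out` names standing for the black and white CST connectors, the intrinsic attributes `hours` and `rate`, and the cost formula, which the figure does not show.

```python
class Fixed_Cost:
    """Estimate entered by hand; offers a white CST connector."""
    def __init__(self, amount):
        self.amount = amount          # intrinsic attribute

    def cst_out(self):                # value exported across CST
        return self.amount


class Comp_Cost:
    """Computed estimate; white CST connector upward, black one downward."""
    def __init__(self, hours, rate):
        self.hours = hours            # intrinsic attribute (assumed)
        self.rate = rate              # intrinsic attribute (assumed)
        self.cst_in = None            # black connector: the related subpart

    def cst_out(self):
        sub = self.cst_in.cst_out() if self.cst_in else 0
        return self.hours * self.rate + sub      # derived attribute


class Proj_Cost:
    """Total project estimate; sees only the value arriving across CST."""
    def __init__(self):
        self.cst_in = None            # black connector

    def total(self):
        return self.cst_in.cst_out()


project = Proj_Cost()
middle = Comp_Cost(hours=10, rate=50)
project.cst_in = middle
middle.cst_in = Fixed_Cost(200)
first_total = project.total()

# Re-relate at run time: replace the fixed estimate with a computed one.
middle.cst_in = Comp_Cost(hours=4, rate=25)
second_total = project.total()
```

`Proj_Cost` is never told which kind of object sits at the other end of the relationship; it only reads the value that flows across it.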


To illustrate how actual objects of the types shown in Figure 1 might behave, Figure
2 provides a more detailed view of some sample objects of those types. Here we have
shown  not  only  relationships,  but  also  internal  attribute  values,  the  values  that  flow  across 
relationships,  and  the  fact  that  some  internal  attributes  are  derived.  In  this  case,  a 
Comp_Cost  object  is  shown  to  have  three  internal  attributes:  two  intrinsic  attributes, 
and,  as  shown  by  the  arrows,  a  derived  attribute  which  computes  its  value  as  a  function 
(not  shown)  of  the  intrinsic  attributes  and  the  values  passed  into  the  object  from  the  out¬ 
side.  Similarly,  objects  of  type  Proj_Cost  have  a  single  derived  attribute,  and  objects  of 
type  Fixed_Cost  have  a  single  intrinsic  attribute. 

To  see  how  a  Cactis  database  handles  updates,  consider  what  would  happen  if  the 
single  attribute  in  the  Fixed_Cost  object  shown  were  changed.  Because  of  the  specific 
data  level  relationships  that  have  been  established,  changing  the  Fixed_Cost  attribute 
indirectly  affects  both  the  derived  attribute  of  the  middle  Comp_Cost  object  and  the 
derived attribute of the Proj_Cost object. When such a change is made, the Cactis system
is responsible for identifying and performing any indirect updates needed to bring
the system up to date with respect to all attribute evaluation rules. In this case, the system
must recompute two attributes. Later we will see that, if the results are not immediately
needed, such computations are not performed immediately, but deferred until the
value is actually required. The user also has the option of declaring that an attribute is
important. This indicates that the system is to maintain the correct value for the attribute
at all times. This capability is used, for example, to ensure that the attributes computing
the predicate for each constraint are always reevaluated if they could change.


In this section we have seen an outline of the Cactis data model. We have seen that
data is modeled as a set of objects connected by relationships. These objects encapsulate
data and behavior by providing evaluation rules that indicate how some data is derived
in  terms  of  other  data.  Looking  at  this  model  in  a  different  light,  it  can  be  seen  as 
equivalent  to  a  form  of  attributed  graph.  That  is,  a  graph  which  has  its  nodes  decorated 
with  attribute  values,  and  for  which  there  is  a  set  of  defining  attribute  evaluation  rules 
which  describe  how  attributes  may  be  derived  from  other  attributes.  Consequently,  this 
model is related to the attribute grammars used in compilers and language dependent editors.
This relationship will point the way to a new incremental update algorithm in the
next  section. 

4.  Efficient  Incremental  Update 

A  number  of  data  models  have  made  provisions  for  functionally  derived  or  active 
data  which  can  respond  to  changes  in  surrounding  data.  As  a  typical  example,  the 
LOOPS  object-oriented  programming  system  [4]  provides  active  data  which  can  invoke  a 
procedure  whenever  a  data  item  is  accessed.  The  implementations  of  active  values  in 
LOOPS, as well as most other systems which provide active or derived data, use techniques
equivalent to triggers [6] attached to data. While this method is adequate for
sparsely  interconnected  data,  it  can  present  problems  for  more  highly  interconnected 
data.  Since  there  is  no  restriction  on  the  kinds  of  actions  performed  by  triggers,  the  order 
of  their  firing  can  change  their  overall  effect.  While  this  allows  triggers  to  be  extremely 
flexible,  it  can  also  become  very  difficult  to  keep  track  of  the  interrelationships  between 
triggers.  Hence,  it  is  easy  for  errors  involving  unforeseen  interrelationships  to  occur,  and 
much more difficult to predict the behavior of the system under unexpected circumstances.

By contrast, the effects of attribute evaluation computations used in the Cactis system
are much easier to isolate and understand. Each data type in the system can be understood
in terms of the values it transmits out across relationships, the values it expects
to receive across relationships, and the local attributes of the object. This allows the
schema to be designed in a structured fashion and brings with it many of the advantages
of modern structured programming techniques.

Even  if  we  can  adequately  deal  with  the  unconstrained  and  unstructured  nature  of 
triggers,  they  can  also  be  highly  inefficient.  Figure  3  shows  the  interrelationships 
between  several  pieces  of  data.  The  arcs  in  the  graph  represent  the  fact  that  a  change  in 
one  piece  of  data  invokes  a  trigger  which  modifies  another  piece  of  data.  For  example, 
modifying the data marked A affects the data items marked B and C. If we choose a
naive  ordering  for  recomputing  data  values  after  a  change,  we  may  waste  a  great  deal  of 
work  by  computing  the  same  data  values  several  times.  For  example,  a  simple  trigger 
mechanism  might  work  recursively,  invoking  new  triggers  as  soon  as  data  changes. 


Figure  3.  A  System  of  Dependencies 


However, in our example, this simple scheme would result in recomputing data value I
eight times, once for each path from the original change to the data item. In fact, only a
few of the many possible orderings of computations do not needlessly recompute some
data values. In this case, a breadth first order is optimal; however, it is easy to construct
graphs where this is not true. Any trigger mechanism which uses a fixed ordering of
some sort (e.g., depth first or breadth first) will needlessly recompute some values for
many graphs found in practice. In particular, any graph which contains dependencies
which are not limited to trees or linear chains can cause this behavior. In the worst case
triggers can needlessly recompute an exponential number of values. For example, if a
depth first order is used on an extended graph of the type shown in Figure 3, O(2^N)
wasted computations will be performed. On the other hand, the attribute evaluation technique
used in the Cactis system will not evaluate any attribute that is not actually needed,
and will not evaluate any given attribute more than once.
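
The difference between recursive triggers and a mark-then-evaluate scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Figure 3 itself is not reproduced here, so the graph below is a hypothetical chain of "diamonds" (the names diamond_chain, naive_trigger_count, and mark_then_evaluate_count are invented for this sketch), chosen because three stacked diamonds give eight paths to the last node.

```python
from collections import defaultdict

def diamond_chain(k):
    """Build k stacked diamonds: n0 -> (l1, r1) -> n1 -> (l2, r2) -> n2 ..."""
    affects = defaultdict(list)
    for i in range(k):
        top, bottom = f"n{i}", f"n{i + 1}"
        for mid in (f"l{i + 1}", f"r{i + 1}"):
            affects[top].append(mid)
            affects[mid].append(bottom)
    return affects

def naive_trigger_count(affects, start):
    """Recursive triggers: each change immediately fires all dependent triggers."""
    counts = defaultdict(int)

    def fire(node):
        for nxt in affects[node]:
            counts[nxt] += 1   # nxt is recomputed once per firing that reaches it
            fire(nxt)

    fire(start)
    return counts

def mark_then_evaluate_count(affects, start):
    """Mark everything reachable first, then evaluate each marked node once."""
    marked, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in affects[node]:
            if nxt not in marked:
                marked.add(nxt)
                stack.append(nxt)
    return {node: 1 for node in marked}

graph = diamond_chain(3)
naive = naive_trigger_count(graph, "n0")
marked = mark_then_evaluate_count(graph, "n0")
print(naive["n3"], marked["n3"])   # prints "8 1"
```

With k diamonds the recursive scheme recomputes the last node 2^k times, while the marking scheme touches it once.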

Cactis supports a number of primitives on top of which the data language described in
Section 6 has been built. These primitives include operations for creating and
deleting  objects,  creating  and  deleting  relationships  between  objects,  defining  predicates 
and  subtypes,  and  primitives  for  retrieving  and  replacing  attribute  values. 

Whenever  these  primitives  are  used  to  change  a  database,  Cactis  must  ensure  that 
all  attribute  values  in  the  database  retain  a  value  which  is  consistent  with  the  attribute 
rules  of  the  system.  This  requires  some  sort  of  attribute  evaluation  strategy  or  algorithm. 
One  approach  would  be  to  recompute  all  attribute  values  every  time  a  change  is  made  to 
any  part  of  the  system.  This  is  clearly  too  expensive.  What  is  needed  is  an  algorithm  for 
incremental attribute evaluation, which computes only those attributes whose values
change as a result of a given database modification. This problem also arises in the area
of  syntax  directed  editing  systems,  so  it  is  not  surprising  that  algorithms  exist  to  solve 
this  problem  for  the  attribute  grammars  used  in  that  application.  The  most  successful  of 
these  algorithms  is  due  to  Reps  [34].  Reps’  algorithm  is  optimal  in  the  sense  that  only 
attributes  whose  values  actually  change  are  recomputed. 

Unfortunately,  Reps’  algorithm,  while  optimal  for  attributed  trees,  does  not  extend 
directly  to  the  arbitrary  graphs  used  by  Cactis.  Instead,  a  new  incremental  attribute 
evaluation algorithm has been designed for Cactis. This new algorithm exhibits performance
which is similar to Reps' algorithm, but does have an inferior worst case upper
bound  on  the  amount  of  overhead  incurred. 

The  algorithm  works  by  using  a  strategy  which  first  determines  what  work  has  to  be 
done,  then  performs  the  actual  computations.  The  algorithm  uses  the  dependencies 
between attributes. An attribute is dependent on another attribute if that attribute is mentioned
in its attribute evaluation rule (i.e., is needed to compute the derived value of that
attribute). When the value of an intrinsic attribute is changed, it may cause the attributes
which depend on it to become out of date with respect to their defining attribute evaluation
rules. Instead of immediately recomputing these values, we simply mark them as
out-of-date.  We  then  find  all  attributes  which  are  dependent  on  these  newly  out-of-date 
attributes,  and  mark  them  out-of-date  as  well. 

This process continues until we have marked all affected attributes. During this
process of marking, we determine if each marked attribute is important. Designating an
attribute important indicates that the system is to maintain its correct value at all times.
Important attributes include those explicitly designated as important by the user, and
those that compute constraint predicates. When we have completed marking attributes
during the first phase of the algorithm, we will have obtained a list of attributes which are
both out-of-date and important. We then use a demand driven algorithm to evaluate
these attributes in a simple recursive manner. The calculation of attribute values which
are not important may be deferred, as they have no immediate effect on the database. If
the user explicitly requests the value of attributes (i.e., makes a query), new computations
of out-of-date attributes may be invoked in order to obtain correct values. A similar
implementation approach using lazy evaluation is described in [7].
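
The two phases described above can be sketched in Python as follows. This is a minimal illustration, not the Cactis implementation: the Attribute class, its field names, and the cost rules (adding 50, doubling) are assumptions made up for the example, loosely following the Fixed_Cost, Comp_Cost, and Proj_Cost objects discussed earlier.

```python
class Attribute:
    def __init__(self, name, rule=None, inputs=(), value=None, important=False):
        self.name = name
        self.rule = rule                    # None marks an intrinsic attribute
        self.inputs = list(inputs)          # attributes this one depends on
        self.dependents = []                # attributes that depend on this one
        self.value = value
        self.important = important
        self.out_of_date = rule is not None
        self.evaluations = 0
        for source in self.inputs:
            source.dependents.append(self)

def mark_out_of_date(attr, eval_list):
    """Phase 1: mark everything affected, collecting important attributes."""
    if attr.out_of_date:
        return                              # already marked, cut the traversal short
    attr.out_of_date = True
    if attr.important:
        eval_list.append(attr)
    for dep in attr.dependents:
        mark_out_of_date(dep, eval_list)

def set_value(attr, value, eval_list):
    mark_out_of_date(attr, eval_list)
    attr.value = value
    attr.out_of_date = False

def evaluate(attr):
    """Phase 2: demand driven, recursive evaluation."""
    if attr.out_of_date:
        args = [evaluate(src) for src in attr.inputs]
        attr.value = attr.rule(*args)
        attr.out_of_date = False
        attr.evaluations += 1
    return attr.value

def update_system(eval_list):
    for attr in eval_list:
        evaluate(attr)
    eval_list.clear()

# Hypothetical objects: proj_cost is important, report is not.
fixed = Attribute("fixed_cost", value=100)
comp = Attribute("comp_cost", rule=lambda f: f + 50, inputs=[fixed])
proj = Attribute("proj_cost", rule=lambda c: c * 2, inputs=[comp], important=True)
report = Attribute("report", rule=lambda c: f"cost={c}", inputs=[comp])
evaluate(proj)
evaluate(report)                            # bring everything up to date once

pending = []
set_value(fixed, 200, pending)
update_system(pending)                      # recomputes comp_cost and proj_cost only
```

After update_system returns, report is still marked out of date; it is recomputed only when a query asks for it with evaluate(report).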

We  will  now  consider  the  efficiency  of  the  attribute  evaluation  algorithm.  We  will 
do this analysis on the basis of changing a single attribute value; however, the results
extend  to  multiple  attribute  changes  as  well  as  to  changes  in  the  structure  of  the  data 
graph. 

To begin we will define several terms. The relationship of "depends on" defines a
directed graph with attributes labeling nodes. This graph, which we will call
depends_on, will have an edge from the node labeled with A to the node labeled with B
if and only if attribute A depends on attribute B. We define the graph Inverse(g) as simply
the graph g with all the edges reversed. We define a graph Reachable(n,g) as the
subgraph of graph g which is connected to node n (that is, which is reachable by following
edges from node n). Finally we define the set Nodes(g) to be all the nodes of a graph
g, and the set Edges(g) to be all the edges of a graph g.

For  a  given  attribute  A  we  will  define  a  graph  Could_Change(A)  which  describes 
all  the  attributes  that  might  change  value  if  a  new  value  is  assigned  to  the  attribute  A. 
This  graph  contains  the  set  of  attributes  that  depend  on  A  either  directly  or  indirectly. 


Specifically: 


Could_Change(A) = Reachable(A, Inverse(depends_on))

In other words, this is the subgraph reachable by following the depends_on relationship
backwards from A.
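
These definitions translate almost directly into code. The Python sketch below is illustrative only: graphs are represented as dictionaries from a node to the set of nodes its edges point at, the sample depends_on graph is made up (attribute H is hypothetical), and including the start node A itself in the reachable subgraph is an assumption, since the text does not say whether A belongs to its own Could_Change graph.

```python
def inverse(graph):
    """Inverse(g): the same nodes with every edge reversed."""
    inv = {node: set() for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            inv.setdefault(dst, set()).add(src)
    return inv

def reachable(start, graph):
    """Reachable(n, g): the subgraph reached by following edges from start."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return {node: set(graph.get(node, ())) for node in seen}

def could_change(attr, depends_on):
    return reachable(attr, inverse(depends_on))

# Edge X -> Y means "X depends on Y".
depends_on = {
    "A": {"B", "C", "D"},
    "E": {"B", "F", "G"},
    "H": {"A"},
}
print(sorted(could_change("B", depends_on)))   # ['A', 'B', 'E', 'H']
```

Changing B can affect A and E directly and H indirectly, while changing F can affect only E.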

In addition to the Could_Change graph we can also define a set Change(A,V). This
set will contain all attributes which must be reevaluated in order to ensure that all affected
attributes have a correct value. More precisely, the set Change(A,V) contains all attributes
that either 1) require a new value after changing A to V, or 2) are directly derived
from some attribute that requires a new value. Note that Change(A,V) ⊆
Nodes(Could_Change(A)) for all attributes A and values V, and that Change(A,V) = { }
when the current value of attribute A is already V.

When the value V is assigned to an attribute A in the Cactis attribute evaluation
algorithm, no attributes other than those in the set Change(A,V) are reevaluated when we
amortize over any transaction sequence¹. This is exactly the same set that would have
been reevaluated by Reps' optimal evaluation algorithm.

Reps' algorithm is also optimal in the total overhead incurred in attribute evaluation.
Its total overhead in both best and worst cases is limited to O(|Change(A,V)|).
However, Reps' algorithm uses some special properties of trees to achieve this result.
These properties do not apply to the more general graphs used by the Cactis system.

¹ A single change may recompute attributes outside the Change(A,V) set; however, the evaluation of all such attributes must
have been deferred from the Change set of some previous transaction. Hence the evaluation of these attributes can be charged to the
amortized cost function of that previous transaction.

The overhead of the Cactis algorithm is not optimal. In particular its worst case
amortized overhead is:

O(|Nodes(Could_Change(A))| + |Edges(Could_Change(A))|)

This behavior comes from the mark out-of-date phase of the algorithm which does a
depth first traversal starting from the node A. In the worst case this traversal may visit
the entire Could_Change graph for A, hence visiting each node and edge in this graph.
However, this is the worst case behavior. In many real cases this traversal will be cut
short by finding attributes which are already out-of-date. For example, if an attribute A
were assigned two different values in a row before updating the system, the second
assignment would only update A and not visit any other attributes and hence incur only
O(1) overhead. In general the actual performance of the Cactis attribute evaluation algorithm
will depend on the attributes involved, particularly on whether some attributes may
remain as out-of-date for long periods of time if they are not important and are not
accessed. Also, if a given attribute is changed as a result of two different primitive
updates to intrinsic attributes, the given attribute will only be reevaluated once (unless, of
course, the given attribute has been accessed or used to compute a constraint predicate
before the second primitive update is performed).

In order to support the primitives which break and establish relationships, a process
similar to that used for intrinsic attribute changes is used. When a relationship is broken,
the system determines which derived attributes depend on values that are passed across
the relationship. These attributes are marked out-of-date just as if an intrinsic attribute
had changed. When a relationship is established, the second half of the attribute evaluation
algorithm is invoked to evaluate attributes which are out-of-date and important. In
order to ensure that derived attributes can always be given a valid value, the database
ensures that connectors are not left dangling without a matching relationship. If relationships
for each connector of an object are not explicitly provided by the transaction, the
system will automatically provide default values to replace any values that would normally
flow into the object along the missing relationships. As a final note, notice
that the primitive to delete an object can be treated the same as breaking all relationships
to the object, and the primitive to create an object does not affect attribute evaluation
until relationships are established.

During the evaluation of attributes, certain attributes will compute constraint predicates.
After such an attribute is evaluated, its value is tested. If it evaluates false, a constraint
violation exists. Under user control, this can either cause the transaction invoking
the evaluation to fail and be rolled back or, optionally, a special recovery action associated
with the constraint can be invoked to attempt to recover from the violation. In
either case, the constraint must be satisfied or the transaction invoking the evaluation will
fail and be rolled back.

5.  The  Storage  Structures  and  Access  Methods  of  Cactis 

We begin our discussion of the implementation of Cactis by describing a straightforward
implementation of the model which does not attempt to optimize disk access.
We will then show how Cactis uses a self-adaptive and concurrent approach to create an
optimized implementation of the model.

In this section we concentrate on the implementation of the attribute evaluation portion
of the Cactis model. This represents the vast majority of the code within Cactis. In
the last section we gave an informal presentation of the incremental attribute evaluation
algorithm used by the Cactis data model. In this section we give a more concrete and
specific description.


The  algorithm  uses  several  pieces  of  information  for  each  attribute  in  the  data, 
including: 

value  -  The  actual  value  assigned  to  the  attribute. 

outofdate - A boolean value which indicates whether the attribute has been marked
out-of-date.

changetime  -  An  integer  timestamp  that  indicates  when  the  value  was  last  assigned  a 
new  value. 

readtime  -  An  integer  timestamp  that  indicates  when  the  value  was  last  used  for  a 
computation  or  read  by  the  user. 

In  addition,  attribute  information  that  is  the  same  for  all  data  objects  of  one  object  type  is 
stored  in  the  schema.  This  information  includes: 

important  -  A  boolean  value  which  indicates  if  the  attribute  is  designated  important. 

evalproc - A procedure which encodes the attribute evaluation rule for the attribute.

dependson  -  A  set  of  things  that  this  attribute  depends  on. 

The value of the attribute is simply whatever value the attribute currently has. This
value may or may not be correct depending on whether the attribute is up to date. The
outofdate flag indicates whether the attribute might be out of date with respect to its
defining attribute evaluation rule. When outofdate is false the attribute will have a
correct value. When outofdate is true the attribute may or may not have a correct value.
It is important to note that the database need not know anything about the internal structure
of the attribute value other than its total size. Knowledge about all other aspects of
the attribute is encapsulated in the evalproc for the attribute. This encapsulation greatly
simplifies the construction of the attribute evaluation system.

The important flag indicates if the attribute is to be considered important. If an
attribute is considered important, the system will ensure that it always has an up to date
value. In general, the important flag for an attribute within an object of a given type
does not change, hence it is stored in the schema. Occasionally the importance of an attribute
should change over time and depend on a predicate. In this case we can introduce a
new attribute which is designated important. The evaluation rule for this new attribute
will evaluate the predicate which determines if the original attribute is to be considered
important. The evaluation rule will then request the correct value of the original attribute
if and only if it is to be considered important (i.e., the predicate evaluates true).

The  changetime  associated  with  an  attribute  gives  the  timestamp  for  the  last  time 
the  attribute  value  was  changed.  This  changetime  information  can  be  used  to  avoid 
unnecessary  attribute  evaluations.  If  the  changetime  of  an  attribute  being  evaluated  is 
later  than  the  changetimes  of  all  of  the  attributes  it  depends  on,  it  need  not  be 
reevaluated nor its changetime modified. If an attribute is reevaluated but does not actually
change value, its changetime can also be left undisturbed. This allows us to avoid
chains of attribute evaluations which do not result in any value changes (although we
must still incur overhead for these attributes). The changetime and readtime of an attribute
are used in the concurrency control algorithm discussed in Section 5.4.

The evalproc of the attribute is a procedure which encodes the attribute's evaluation
rule. This procedure will request other attribute values that it needs to perform its
evaluation and will compute a new value and changetime for the attribute when needed.
This procedure is provided by the data definition language compiler on the basis of the
attribute evaluation rule given by the user. Since the evalproc of an attribute of an object
of a given type does not change, it is stored in the schema. The form of the evalproc is a
compiled C function. This currently limits the extent to which dynamic schema changes
can be made without relinking. An improvement which has not yet been implemented
would be to represent an evalproc as a list of machine independent P-codes like those
used to implement some machine dependent compilers [32]. In the latter case, these P-codes
would be interpreted rather than executed directly. This would allow new attributes
and evaluation routines to be added while the database is running.

Finally,  the  dependson  set  associated  with  an  attribute  encodes  those  things  (if  any) 
that  the  attribute  will  need  to  compute  its  value.  Again,  since  the  dependson  set  of  an 
attribute  within  an  object  of  a  particular  type  does  not  change,  this  information  is  stored 
in  the  schema.  Further  note  that  the  dependson  sets  only  encode  the  portion  of  the 
overall dependency graph that is local to a given object type. Traversal of the actual global
dependency graph is performed by dynamically linking together the appropriate local
dependency  subgraphs  found  in  the  schema.  The  global  dependency  graph  is  never 
explicitly  constructed  or  stored. 

As detailed above, a Cactis database must store "overhead" information with each
attribute of each data object. However, since the system is designed to support applications
with large complex objects, the space for this information may not prove to be
significant in practice. In particular, if 32 bit time stamps are used, this information only
amounts to 65 bits of storage beyond that used by the attribute value itself.
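
The split between per-attribute and per-schema information, and the 65 bit figure, can be made concrete with a short Python sketch. The class names AttributeSchema and AttributeInstance are invented for this illustration; only the field names (value, outofdate, changetime, readtime, important, evalproc, dependson) come from the text, and the sample rule is arbitrary.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Optional

@dataclass(frozen=True)
class AttributeSchema:
    """Stored once per attribute of an object type (in the schema)."""
    name: str
    important: bool = False
    evalproc: Optional[Callable[..., Any]] = None
    dependson: FrozenSet[str] = frozenset()

@dataclass
class AttributeInstance:
    """Stored with every attribute of every data object."""
    schema: AttributeSchema
    value: Any = None
    outofdate: bool = False
    changetime: int = 0
    readtime: int = 0

TIMESTAMP_BITS = 32
# One outofdate flag plus the changetime and readtime timestamps.
overhead_bits = 1 + 2 * TIMESTAMP_BITS

total = AttributeSchema("total", important=True,
                        evalproc=lambda b, c: b + c,
                        dependson=frozenset({"B", "C"}))
first = AttributeInstance(total, value=7)
second = AttributeInstance(total, value=11, outofdate=True)
print(overhead_bits)   # prints 65
```

Both instances share a single schema record, so the important flag, evalproc, and dependson set add nothing to the per-object storage cost.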


5.1.  A  Naive  Evaluation  Algorithm 

We  will  now  discuss  a  simple  version  of  the  Cactis  attribute  evaluation  algorithm 
which  is  efficient  in  the  number  of  attributes  which  are  evaluated,  but  which  is  naive  with 
respect  to  I/O  cost.  In  the  next  section  we  will  consider  an  improved  algorithm  which 
attempts to optimize the amount of I/O performed. Recall that the attribute evaluation
algorithm has two phases: an initial "mark out-of-date" phase, and a second phase which
reevaluates attributes as needed. The first of these phases is initiated when attributes are
assigned new values. The routines Set_Attr_Value and Mark_Out_Of_Date given in Figure
4 show how this phase is implemented.

Each time an attribute is assigned a new value, all attributes that depend on that
attribute directly or indirectly are marked as out-of-date using the Mark_Out_Of_Date
routine, and the changetime for the modified attribute is set to the current virtual time.
An attribute is said to depend on another attribute if that other attribute might be needed
to evaluate its value (in other words, if it is mentioned in its attribute evaluation rule). For
example, if the attribute A uses the following attribute evaluation rule from the schema:

A  :=  (B  +  C)  *  (D  -  2); 

then  the  attribute  A  will  depend  on  attributes  B,  C,  and  D. 

The Mark_Out_Of_Date routine is a simple depth first marking procedure. The
only special thing that it does is to record on an evaluation list any attribute which is
designated important. Attributes on this list will be those that are currently both important
and out-of-date. These are the attributes which the system will be concerned with in
the second phase of attribute evaluation.


Procedure Set_Attr_Value(
    InOut A : Attribute;
    In V : Value;
    InOut Eval_List : List of Attribute)
Begin
    Mark_Out_Of_Date(A, Eval_List);
    A.value := V;
    A.outofdate := FALSE;
    A.changetime := NOW;
End;

Procedure Mark_Out_Of_Date(
    InOut A : Attribute;
    InOut Eval_List : List of Attribute)
Begin
    If Not A.outofdate Then Begin
        If A.important Then
            Eval_List := Eval_List || A;
        A.outofdate := TRUE;
        For Each B such that B depends on A Do
            Mark_Out_Of_Date(B, Eval_List);
    End;
End;


Figure 4. "Naive" Attribute Evaluation Routines

As shown here, the Mark_Out_Of_Date routine makes no attempt to minimize disk
access costs, and always uses a fixed depth first order of traversal. In the next section we
will consider how we can dynamically choose a traversal order which is most likely to be
efficient in terms of disk access.

The second phase of the attribute evaluation algorithm is invoked after a series of attribute
changes using the Update_System routine given in Figure 5. This routine simply
evaluates each of the attributes on the evaluation list that was collected in the mark
out-of-date phase of attribute evaluation. Each attribute is evaluated by the Eval_Attr routine
shown in Figure 5. This routine uses the evalproc attached to the attribute to reevaluate it
if it is marked out-of-date, then returns the value and a timestamp indicating when the
value was last changed.


Procedure Eval_Attr(InOut A : Attribute) : <Value, Timestamp>
Begin
    If A.outofdate Then
        A.evalproc(A);
    Return(<A.value, A.changetime>);
End;

Procedure Update_System(InOut Eval_List : List of Attribute)
    Local Var Dummy : <Value, Timestamp>;
Begin
    For Each A on Eval_List Do
        Dummy := Eval_Attr(A);
    Eval_List := Empty;
End;


Figure 5. "Naive" Attribute Evaluation Routines

The evalproc attached to each attribute encodes the attribute evaluation rule for the
attribute (if any). These routines are slightly more complex than the attribute evaluation
rule expressions they are derived from because they examine the changetime of all
the values they request in order to determine if the attribute computation can be avoided.
However, they follow a very rigid pattern and are easily created by the data definition
language compiler based on the attribute evaluation rules given by the user.
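
The rigid pattern followed by an evalproc can be sketched in Python as follows. This is a guess at the general shape rather than the code the Cactis compiler emits: the Attr class, make_evalproc, and assign are hypothetical helpers, the virtual clock is a simple counter, and the attribute named big (with rule "A > 10") is invented to show a chain of evaluations being cut off. The rule for A is the one used as an example in the text.

```python
import itertools

_clock = itertools.count(1)

def now():
    """Virtual time: a strictly increasing integer timestamp."""
    return next(_clock)

class Attr:
    def __init__(self, value=None, evalproc=None):
        self.value = value
        self.evalproc = evalproc            # None marks an intrinsic attribute
        self.outofdate = evalproc is not None
        self.changetime = now() if evalproc is None else 0
        self.computations = 0

def eval_attr(a):
    if a.outofdate:
        a.evalproc(a)
    return a.value, a.changetime

def make_evalproc(inputs, rule):
    """Build an evalproc that skips work when no input changed after a did."""
    def evalproc(a):
        results = [eval_attr(src) for src in inputs]
        newest = max(t for _, t in results)
        if not a.changetime > newest:       # some input is at least as new as a
            new_value = rule(*(v for v, _ in results))
            a.computations += 1
            if new_value != a.value:        # unchanged value keeps its changetime
                a.value = new_value
                a.changetime = now()
        a.outofdate = False
    return evalproc

def assign(attr, value, affected):
    attr.value = value
    attr.changetime = now()
    for dep in affected:                    # stand-in for the mark out-of-date phase
        dep.outofdate = True

B, C, D = Attr(3), Attr(4), Attr(5)
A = Attr(evalproc=make_evalproc([B, C, D], lambda b, c, d: (b + c) * (d - 2)))
big = Attr(evalproc=make_evalproc([A], lambda a: a > 10))
eval_attr(big)                              # initial evaluation: A = 21, big = True

assign(B, 4, [A, big])
assign(C, 3, [A, big])                      # B + C is unchanged, so A stays 21
eval_attr(big)
```

In the second evaluation A is recomputed but keeps its old changetime because its value is unchanged, so the evalproc for big finds nothing newer than itself and skips its own computation.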


In addition to simple expressions, the attribute evaluation algorithm can handle conditional
expressions. For example, we can have an attribute E whose attribute evaluation
rule is:


E  :=  If  B  Then  F  Else  G; 


This rule is implemented using an evalproc which is "lazy", that is, a routine which
requests values only if they are actually needed. For example, when the attribute B happens to
be true we request the value of attribute F but not attribute G. This laziness property
allows out-of-date values to remain in the database until they are actually needed. This
can represent a significant savings.
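
A lazy evalproc for the conditional rule above might look like the following Python sketch. The Lazy class and the constant values given to B, F, and G are assumptions for illustration; the log simply records which attributes were actually evaluated.

```python
class Lazy:
    def __init__(self, name, compute):
        self.name = name
        self.compute = compute
        self.outofdate = True
        self.value = None

    def get(self, log):
        """Evaluate on demand, recording each evaluation in log."""
        if self.outofdate:
            log.append(self.name)
            self.value = self.compute(log)
            self.outofdate = False
        return self.value

B = Lazy("B", lambda log: True)
F = Lazy("F", lambda log: 10)
G = Lazy("G", lambda log: 20)
# E := If B Then F Else G; only the selected branch is requested.
E = Lazy("E", lambda log: F.get(log) if B.get(log) else G.get(log))

log = []
result = E.get(log)
print(result, log, G.outofdate)   # prints: 10 ['E', 'B', 'F'] True
```

G is never requested, so it remains out of date until some later computation or query needs it.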

It is interesting to note that because of the clean formalism of attributed graphs that
we have used, we can express each of the four central routines used for automatic update
of values in under 15 lines of pseudo code.

In  the  next  section  we  consider  how  to  optimize  update  with  respect  to  disk  access. 
Again  we  will  find  that  the  formalism  used  allows  the  optimized  update  algorithms  to 
remain  fairly  simple. 


5.2.  Optimized  Update 

The algorithms we have outlined above are efficient in terms of the number of attributes
that they recompute when changes are made. However, they are not necessarily
efficient in terms of the number of disk accesses needed. In this section we look at optimizations
that Cactis uses to reduce the number of disk accesses performed.

If we examine the Mark_Out_Of_Date and Eval_Attr routines which are central to
the evaluation algorithm, we see that they are each just a traversal of part of the attribute
dependency graph. In the naive algorithm, these traversals are implemented using
straightforward recursive procedure calls, and hence represent a depth first ordering. In
the actual Cactis implementation, these traversals are performed explicitly by the system
in an order determined at run-time and can therefore be optimized. Because all attribute
evaluation rules are applicative in nature, we may in fact choose any traversal order
which visits the same attributes. In particular, we are free to choose an order which
reduces the number of disk accesses required.

In Cactis, we use an order of traversal which is chosen dynamically. The way we
choose this order is to use a concurrent system in which a number of sub-traversals are
(conceptually) running at the same time. Each time we reach a node which has two or
more descendants to traverse, we fork a sub-traversal process to traverse the graph in
each direction. For example, when we mark an attribute out-of-date, we then schedule a
traversal process for each of the attributes which depend on it. When we evaluate an
attribute, we request all the values needed to recompute its value in parallel. We can
think of this as a parallel traversal of the graph where each branch of the traversal runs
independently. To optimize disk access we use a greedy technique. Of all the sub-traversal
processes which are runnable at any given time we will choose to execute the
one which we expect to perform the least number of disk accesses.
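
The effect of the greedy choice can be illustrated with a small Python sketch. Everything specific here is an assumption made for the example: the buffer holds a single disk block, the expected cost of a sub-traversal is 0 if its attribute is in the resident block and 1 otherwise, the runnable set is scanned linearly rather than kept in a priority queue, and the five attributes and their block assignments are invented.

```python
def traverse(dependents, block_of, start, greedy):
    """Mark everything reachable from start; return (visit order, block reads)."""
    resident = None                 # assumption: the buffer holds a single block
    reads, order = 0, []
    runnable, seen = [start], {start}
    while runnable:
        if greedy:
            # Pick the runnable sub-traversal expected to need the fewest reads.
            node = min(runnable, key=lambda n: block_of[n] != resident)
            runnable.remove(node)
        else:
            node = runnable.pop()   # fixed depth first order
        if block_of[node] != resident:
            resident = block_of[node]
            reads += 1
        order.append(node)
        for nxt in dependents.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                runnable.append(nxt)
    return order, reads

dependents = {"A": ["C", "B"], "B": ["D"], "C": ["E"]}
block_of = {"A": 0, "C": 0, "E": 0, "B": 1, "D": 1}

fixed_order, fixed_reads = traverse(dependents, block_of, "A", greedy=False)
greedy_order, greedy_reads = traverse(dependents, block_of, "A", greedy=True)
print(fixed_order, fixed_reads)     # ['A', 'B', 'D', 'C', 'E'] 3
print(greedy_order, greedy_reads)   # ['A', 'C', 'E', 'B', 'D'] 2
```

Both orders visit the same attributes, which is all that correctness requires; the greedy order simply finishes the work in the resident block before moving to another one.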

In practice we will not create actual separate processes to accomplish our parallel
traversal but instead simulate multiple processes in a single process. We break all computations
into pieces or chunks to be scheduled independently. A chunk is a small piece
of code that runs to completion and performs one small task. For example, a normal
attribute evaluation rule is implemented using two chunks. The first schedules an evaluation
for each of the other attribute values it depends on, then makes arrangements to
schedule the second chunk when all the values are available. The second chunk, which is
scheduled only after all the values it needs have been computed, executes the attribute
evaluation rule in order to compute the final value for the attribute. It then stores the
value and informs any process waiting for the value that it is now available.

Processes in the system are each represented by what we call a pending record. A
pending record is simply a data structure representing a pending computation. It contains
bookkeeping information such as the name of the attribute involved, the number of
values being waited for, storage for the values, an optional pointer to a list of parent
pending records to be informed upon completion, and a pointer to a chunk routine to
execute when all values are available. All chunk routines are constructed by the data
language compiler discussed in Section 6. The system works by removing a pending
record from a priority queue of all currently runnable processes, and simply calling the
chunk routine found in the pending record. This routine performs an appropriate action
and terminates, at which time the system chooses a new process to run. The chunk
routine for a process can perform actions such as computing a new attribute value or
scheduling one or more new pending records. In addition, the chunk routine can, if
needed, inform parent processes of completion. Informing a parent involves
decrementing the number of values that the parent is waiting for (stored in the pending record for
the parent) and, if this number falls to zero, moving the parent’s pending record to the
runnable queue.
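
As an illustration only, the pending-record mechanism described above can be sketched in Python (the Cactis implementation itself generates C). The names PendingRecord, Scheduler, inform_parents, and demo are hypothetical, and the sketch assumes that a smaller priority number means "run sooner".

```python
import heapq
import itertools


class PendingRecord:
    """One simulated process: a pending computation for a single attribute."""

    def __init__(self, attribute, chunk, waiting_for=0, parents=()):
        self.attribute = attribute      # name of the attribute involved
        self.chunk = chunk              # chunk routine to call when runnable
        self.waiting_for = waiting_for  # number of values still outstanding
        self.values = []                # storage for the delivered values
        self.parents = list(parents)    # records to inform upon completion


class Scheduler:
    """Single-process simulation of many sub-traversal processes."""

    def __init__(self):
        self._queue = []               # heap of (priority, sequence, record)
        self._seq = itertools.count()  # tie-breaker so records are never compared

    def schedule(self, record, priority=0):
        heapq.heappush(self._queue, (priority, next(self._seq), record))

    def inform_parents(self, record, value):
        # Deliver a value; a parent becomes runnable when its count reaches zero.
        for parent in record.parents:
            parent.values.append(value)
            parent.waiting_for -= 1
            if parent.waiting_for == 0:
                self.schedule(parent)

    def run(self):
        while self._queue:
            _, _, record = heapq.heappop(self._queue)
            record.chunk(self, record)  # runs to completion, then we choose again


def demo():
    """Evaluate total = a + b using request chunks and one evaluation chunk."""
    results = {}
    sched = Scheduler()

    def evaluate_total(s, rec):
        results[rec.attribute] = sum(rec.values)

    def make_leaf(value):
        def leaf(s, rec):
            s.inform_parents(rec, value)
        return leaf

    total = PendingRecord("total", evaluate_total, waiting_for=2)
    sched.schedule(PendingRecord("a", make_leaf(2), parents=[total]))
    sched.schedule(PendingRecord("b", make_leaf(3), parents=[total]))
    sched.run()
    return results
```

In this sketch the record for total is never placed on the queue directly; it becomes runnable only when the second of its two inputs has been delivered.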

The  scheme  for  implementing  simulated  concurrency  that  we  have  described  is  very 
simple, easy to implement, and has proven quite efficient. The technique we use is
similar to that used in the OWL real-time concurrent programming language. For additional
information about implementation details, expected performance, translation of programs
into chunks, and experience with the OWL language, see [14].


Once  concurrency  has  been  introduced,  the  process  of  choosing  a  good  traversal 
order  simplifies  to  a  scheduling  problem.  We  choose  a  process  to  run  which  we  expect  to 
perform  the  least  disk  accesses.  The  obvious  choice  for  this  process  is  one  which  can  be 
processed  using  attributes  currently  in  memory.  Note  that  each  process  is  associated  with 
one  attribute,  the  one  it  is  computing  or  marking  out-of-date.  It  may  need  other  attribute 
values  to  compute  its  own  value,  but  these  are  the  responsibility  of  other  processes.  Any 
needed  values  will  have  been  collected  in  storage  attached  to  the  process’  pending  record 
before  it  is  scheduled  as  runnable. 

We  use  a  simple  hashing  scheme  to  index  all  pending  records  by  the  objects  that 
contain  the  attribute  that  they  are  associated  with.  Whenever  a  disk  block  is  read  into 
memory,  all  processes  which  are  associated  with  some  object  stored  on  that  block  are 
promoted  to  a  special  very  high  priority  queue.  When  new  pending  requests  are 
scheduled, we first check to see if the object associated with the request is already in
memory; if so, we schedule the request on the high priority queue. Since they can be
executed without additional disk access, processes on the high priority queue always have
priority over other processes.

Repeat
    Choose the most referenced object in the database that has not yet been assigned a block.
    Place this object in a new block.
    Repeat
        Choose the relationship belonging to some object assigned to the block such that:
            (1) The relationship is connected to an unassigned object outside the block, and
            (2) The total usage count for the relationship is the highest.
        Assign the object attached to this relationship to the block.
    Until the block is full.
Until all objects are assigned blocks.

Figure 6. Clustering Algorithm

In  order  to  improve  the  locality  of  data  references,  we  cluster  data  in  the  Cactis 
model  on  the  basis  of  usage  patterns.  We  keep  a  count  of  the  total  number  of  times  each 
object  in  the  database  is  accessed,  as  well  as  the  number  of  times  we  cross  a  relationship 
between  objects  in  the  process  of  attribute  evaluation  or  marking  out-of-date.  We  then 
periodically  reorganize  the  database  on  the  basis  of  this  information.  In  particular  we 
pack the database into blocks using the greedy algorithm shown in Figure 6. This
algorithm attempts to place objects which are frequently referenced together in the same
block. This tightens the locality of reference for the database and results in increased
performance in databases where query streams exhibit commonalities over time. In fact,
performance tests discussed in Section 7 indicate that proper clustering can result in
savings of up to 60%. In addition to the clustering we have described, offline reorganization
includes  housekeeping  tasks  such  as  garbage  detection  and  collection  and  the  generation 
of  statistics  we  use  for  scheduling,  as  described  below. 
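
The greedy packing of Figure 6 can be sketched as follows. This is a minimal illustration, not the Cactis code: the function name cluster and its parameters are hypothetical, block capacity is assumed to be a fixed number of objects, and a block is assumed to be closed early when no relationship leads from it to an unassigned object (a case Figure 6 does not spell out).

```python
def cluster(access_count, rel_usage, block_size):
    """Greedily pack objects into blocks in the manner of Figure 6.

    access_count: {object: number of times the object was accessed}
    rel_usage:    {(obj_a, obj_b): number of times the relationship was crossed}
    block_size:   assumed capacity of a block, counted in objects
    """
    unassigned = set(access_count)
    blocks = []
    while unassigned:
        # Seed a new block with the most referenced unassigned object
        # (sorting first makes ties deterministic).
        seed = max(sorted(unassigned), key=lambda o: access_count[o])
        unassigned.remove(seed)
        block = [seed]
        while len(block) < block_size:
            # Relationships leading from the block to an unassigned object.
            candidates = []
            for (a, b), usage in rel_usage.items():
                if a in block and b in unassigned:
                    candidates.append((usage, b))
                elif b in block and a in unassigned:
                    candidates.append((usage, a))
            if not candidates:
                break  # assumption: leave the block partly empty
            best = max(usage for usage, _ in candidates)
            target = min(obj for usage, obj in candidates if usage == best)
            unassigned.remove(target)
            block.append(target)
        blocks.append(block)
    return blocks
```

For example, with access counts A=10, B=1, C=5, D=2, relationship usage A-B=7, A-C=3, C-D=9, and two objects per block, the sketch places A with B (the most heavily used relationship leaving A) and then C with D.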

Once all in-memory processes have been executed, we must choose a process to
execute which will cause a disk access. We would like to choose the process which will
cause the fewest disk accesses; however, we cannot know in advance which process this
will be. What we do instead is use past behavior (or, in the case of marking
out-of-date, a worst case estimate) as a predictor of future behavior. We keep information
about past behavior in the form of a decaying average which changes over time. This makes the
database self-adaptive, allowing changes in the structure of the database to be reflected in
changing averages and hence changing scheduling priorities.

In  the  Cactis  data  model,  values  flow  across  relationships  in  order  to  communicate 
information  from  one  object  to  another.  In  order  to  provide  statistics  for  self-adaptive 
optimization  of  the  attribute  evaluation  process,  we  tag  each  relationship  with  a  series  of 
decaying averages. These statistics represent the average number of objects visited (or
alternatively the actual amount of disk I/O incurred) when each value transmitted across
the relationship was requested in the past. We use these tags to assign a priority to
pending records which are requesting values from across a relationship. The highest priority is
given  to  the  process  with  lowest  expected  disk  I/O.  Processes  which  request  values  local 
to  an  object  rather  than  across  a  relationship  are  not  of  concern  since  they  will  be 
scheduled  as  high  priority  when  the  object  is  brought  into  memory.  A  special  priority  is 
given  to  processes  which  are  the  direct  user  requests  that  start  a  chain  of  computations. 

In the case of the traversal which performs evaluation of an attribute, we update
statistics when we return to the attribute in order to store its new value. However, in the
case of the mark out-of-date traversal, we do not return to the object and hence cannot
store an updated statistic. In this case we use an alternate worst case statistic computed
when clustering was last performed. This statistic tells how many disk blocks will be
visited in the worst case (i.e., assuming that no attributes to be visited are already marked
out-of-date). A similar worst case statistic is used as an initial estimate for the
dynamically changing decaying averages.
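
The text does not give the exact form of the decaying average. One common choice, assumed here purely for illustration, is an exponentially weighted average seeded with the worst case estimate. The names RelationshipStats, record, pick_next, and the decay parameter are hypothetical.

```python
class RelationshipStats:
    """Decaying average of the I/O incurred by requests across one relationship."""

    def __init__(self, worst_case_blocks, decay=0.5):
        # The worst case statistic from the last clustering pass is the
        # initial estimate, as described in the text.
        self.expected_io = float(worst_case_blocks)
        self.decay = decay  # assumed weight given to past behavior

    def record(self, observed_io):
        # Older observations fade geometrically; recent ones dominate.
        self.expected_io = (self.decay * self.expected_io
                            + (1.0 - self.decay) * observed_io)


def pick_next(pending):
    """pending: list of (process_name, RelationshipStats) pairs.

    Returns the name of the process with the lowest expected disk I/O.
    """
    return min(pending, key=lambda item: item[1].expected_io)[0]
```

With a decay of 0.5 and a worst case estimate of 8 blocks, two observations of 2 blocks move the estimate to 5.0 and then 3.5, so the scheduling priority adapts as usage changes.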

To summarize our strategy for optimized update, we treat the traversals needed to
implement attribute evaluation as a concurrent computation. This allows us to
dynamically choose a traversal order that reduces disk access. In this framework, the choice of a
traversal order simplifies to the choice of a scheduling order. Sub-traversal processes
which can be executed without disk access are given highest scheduling priority. Once
all computations that can be performed on in-memory data have been completed, we
choose processes which have the smallest expected number of disk accesses to run first.
Expected disk accesses are measured by either using self-adaptive past performance
statistics in the form of a decaying average, or on the basis of worst case statistics
gathered at cluster time.

5.3. Triggers

While  the  Cactis  data  model  is  powerful,  it  is  not  as  general  as  unconstrained  trigger 
mechanisms.  In  particular,  it  is  not  possible  to  use  the  attribute  evaluation  strategy  we 
have  discussed  to  directly  make  structural  changes  to  data.  Because  of  the  optimized 
implementation  of  derived  data  we  can  often,  with  a  little  thought,  do  without  structural 
changes that would be needed under other models. However, in the cases where
structural changes are required we must use a scheme equivalent to triggers. This scheme is
compatible  with  the  rest  of  the  attribute  evaluation  mechanism,  but  can  result  in  the  same 
performance  problems  and  unpredictable  effects  as  normal  triggers. 

To implement triggers within the Cactis model, we have simply introduced attribute
evaluation  rules  with  side  effects.  In  particular,  the  following  evaluation  rules  will 
implement  a  trigger  that  should  fire  when  a  certain  predicate  becomes  true. 

P  :=  pred(...); 

LastP  :=  Trigger(P,action_proc); 


Here the predicate controlling the trigger is represented by the function pred and the
routine which implements the action of the trigger is represented by action_proc.
Functionally, the Trigger routine simply returns the value of P. However, it also "cheats" by
looking at the current value of LastP. If the value of LastP is about to change from false to
true, Trigger() has the additional side effect of invoking action_proc. This technique can
also be extended by including a retraction action to be executed when LastP is about to
change from true to false. While the effects of the action_proc can cause an inefficient
chain of other calculations involving triggers, we can at least use the efficient attribute
evaluation algorithm to decide when to fire particular triggers.
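
The edge-detecting behavior of the Trigger routine can be sketched as follows. This is an illustration under stated assumptions, not the Cactis implementation: the stored attribute LastP is modeled as a field of a hypothetical Trigger class, its initial value is assumed to be false, and retract_proc stands in for the optional retraction action.

```python
class Trigger:
    """Returns P unchanged, but fires side effects when P changes value."""

    def __init__(self, action_proc, retract_proc=None):
        self.last_p = False               # plays the role of LastP (assumed initially false)
        self.action_proc = action_proc    # invoked on a false -> true change
        self.retract_proc = retract_proc  # optional, invoked on true -> false

    def __call__(self, p):
        if p and not self.last_p:
            self.action_proc()
        elif not p and self.last_p and self.retract_proc is not None:
            self.retract_proc()
        self.last_p = p                   # corresponds to LastP := Trigger(P, action_proc)
        return p
```

Re-evaluating the rule while P stays true does not fire the action again; only a change in the value of P does.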

5.4.  Concurrency  Control 

Cactis uses a timestamping concurrency control technique [43]. Because of the
possibility of an update involving a long chain of computations which touches many objects,
a  locking  mechanism  was  judged  too  costly. 

Concurrency control in Cactis is maintained at the level of individual attributes.
This allows a significant amount of concurrency, but does involve a potentially
significant space overhead. As Cactis is intended for applications with large, complex
objects, the space required for timestamps may not prove to be that significant. As
discussed in Section 5, each attribute has associated with it a read timestamp (readtime)
and a write timestamp (changetime). To support rollback, a log mechanism was
implemented. Standard timestamp logic is used; when a read or write conflict occurs, a
transaction’s log is deleted and the transaction is restarted with a new timestamp.

While it would seem that the transitive attribute dependencies that occur in a Cactis
database would require more than local timestamp checks, this is not the case. Instead,
all the required logic to properly handle transitive dependencies can be implemented as a
part of the normal marking and evaluation phases of the attribute evaluation algorithm.
If a value is transitively needed to compute another value, its timestamps will be checked
as a part of the (recursive) evaluation phase. The only unusual aspect of the system is
that a request to read a value may cause both the read and write timestamps to be
updated, since the value may need to be recomputed.
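
The following sketch shows textbook timestamp-ordering checks on a single attribute, including the case noted above where a read forces a recomputation and so updates both timestamps. It is an assumption-laden illustration: the Attribute class, the Conflict exception, the out_of_date flag, and the rule callable are hypothetical, and logging and transaction restart are omitted.

```python
class Conflict(Exception):
    """Raised when a transaction arrives too late; the caller would restart it."""


class Attribute:
    def __init__(self, value=None, rule=None):
        self.value = value
        self.rule = rule          # optional callable(ts) that recomputes the value
        self.out_of_date = False
        self.readtime = 0         # timestamp of the youngest reader
        self.changetime = 0       # timestamp of the youngest writer


def write(attr, ts, value):
    # A write is too late if a younger transaction already read or wrote.
    if ts < attr.readtime or ts < attr.changetime:
        raise Conflict()
    attr.value = value
    attr.changetime = ts


def read(attr, ts):
    # A read is too late if a younger transaction already wrote.
    if ts < attr.changetime:
        raise Conflict()
    if attr.out_of_date and attr.rule is not None:
        # Recomputation reads the inputs recursively (checking their
        # timestamps) and counts as a write of this attribute.
        write(attr, ts, attr.rule(ts))
        attr.out_of_date = False
    attr.readtime = max(attr.readtime, ts)
    return attr.value
```

Because the rule itself calls read on the attributes it depends on, transitive dependencies are checked with purely local tests, as the text describes.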

In  this  section  we  have  looked  at  a  number  of  the  implementation  details  of  the 
Cactis data model and seen how concurrent, self-adaptive techniques are used to
optimize update. In the next section, we examine the Cactis data language.

6.  The  Cactis  Data  Language 

Part of the Cactis implementation effort has been the construction of an
object-oriented data definition language (DDL) for the data model. This section will give an
overview  of  the  parts  of  this  language  which  deal  with  maintaining  functionally  derived 
data.  Details  of  the  language  constructs  supporting  the  Sembase  sub-system  are 
described  in  [15, 22]. 

6.1. Notation and Structure

In  the  Cactis  data  model,  and  hence  the  Cactis  DDL,  information  is  structured  as 
objects.  These  objects  are  organized  into  a  type  hierarchy  using  a  multiple  inheritance 
type  system.  Individual  objects  encapsulate  a  series  of  named  and  typed  attribute  values, 
along  with  attribute  evaluation  rules  which  define  how  these  values  can  be  derived.  In 
addition, objects may be related to other objects by means of typed and directed
relationships. Information about the object may be exported along these relationships.
Consequently, the primary interface to an object is the set of values that it transmits along its
relationships, and the set of values it expects to be transmitted to it along those
relationships. Although relationships are directed to represent semantic concepts, derived data
can flow in both directions across the relationship.

Since relationships in the Cactis data model are directed, we call each end of a
relationship either a black or a white connector. In the Cactis DDL, relationship types are
declared by stating the number, type, and direction of each of the values that flow across
a relationship. Values may flow either from black to white, or from white to black. Once
a relationship type has been declared, an object type can declare that it possesses a black
or white connector of that relationship type. Two objects can be related if and only if one
possesses a black connector for a given relationship type and the other possesses a white
connector.

As  an  example  of  the  use  of  the  Cactis  DDL,  we  will  describe  the  declaration  of 
objects  which  represent  milestones  in  a  software  project.  Milestones  are  a  good  example 
of  the  kind  of  highly  interrelated  data  that  the  Cactis  system  is  designed  to  handle.  A 
milestone  represents  the  scheduled  and  expected  completion  times  for  a  single  piece  of 
work  to  be  performed  in  the  project.  However,  this  piece  of  work  may  depend  on  the 
timely  completion  of  other  pieces  of  the  project.  Consequently,  changing  the  expected 
completion time of a single milestone (i.e., slipping a deadline) can have important effects
on the overall project. We cannot simply change a single milestone without checking
how  this  affects  other  milestones.  It  is  crucial  that  the  complete  effect  of  such  a  change 
be  derived  without  omission  or  error,  and  that  new  milestones  added  to  the  system  later 
can  also  automatically  take  such  changes  into  account. 

Figure 7 contains an example of the syntax used to declare the relationships and
object  types  for  milestones  (we  have  modified  the  syntax  slightly  here  to  aid  exposition). 


Relationship Type milestone_dep Multi White
Transmits
    exp_compl : time To Black; /* expected completion time coming into milestone */
End;

Relationship Type milestone_need Multi Black
Transmits
    exp_compl : time To Black; /* expected completion time going out of milestone */
End;

/* Object to simulate many-many relationship for milestone dependencies */
Object Type mstone_many
Relationships
    from_obj : Black milestone_need;
    to_obj : White milestone_dep;
Rules
    to_obj.exp_compl := from_obj.exp_compl;
End;

Object Type milestone
Relationships
    depends_on : Black milestone_dep; /* things this milestone depends on */
    needed_by : White milestone_need; /* things depending on this milestone */
Attributes
    sched_compl : time; /* originally scheduled completion time */
    local_work : time; /* time to complete milestone alone */
    expected_completion : time;
Rules
    expected_completion :=
        /* sum of local work and latest out of things depended on */
        local_work +
        Iterator latest : time
        Init 0
        For Each dep In depends_on Do
            latest := later_of(latest, dep.exp_compl);
        End;
    needed_by.exp_compl := expected_completion;
End;

Figure 7. Milestone Object Types

Figure 8. Milestone Objects

Figure 8 shows a graphical representation of several example objects of the types
defined in Figure 7, illustrating how they would be connected in practice. In Figure 7, we
first declare two relationship types, milestone_dep and milestone_need. These
relationships taken together are used to represent the many to many relationship of one milestone
being dependent on another. Cactis relationships can be one to many, but not many to
many. To implement many to many relationships between milestone objects we use an
extra object of type mstone_many. This approach is similar to the strategy used to
implement the set construct in the CODASYL model. As we have declared them, each
relationship is one to many. We use objects of type mstone_many to connect this pair of one
to many relationships into a many to many relationship. Note that these relationships
transmit one value: exp_compl. This is the expected completion time of the milestone
being depended upon.

Each milestone in the system is responsible for computing its own expected
completion time based on the expected completion time of the things it depends on, along
with internal information about how much work is required locally. The milestone object
type shown in Figure 7 contains attributes and attribute evaluation rules to accomplish
this. The attributes sched_compl and local_work are intrinsic attributes representing the
originally scheduled completion time for the milestone and the amount of time needed to
complete the milestone, respectively. The attribute expected_completion computes the
actual expected completion time for the item. This is done using the Iterator facility of
the Cactis DDL to compute the latest of all the milestones that the current
milestone depends on. This latest time is then added to the amount of local work to be done
to obtain an expected completion time for the milestone (we have slightly simplified the
computation here by not properly accounting for milestones which are not dependent on
other milestones). The single attribute evaluation rule shown here can be used to keep
the expected completion times for all milestones in an arbitrarily large project up to date
automatically.

Object Type monitored_milestone Subtype of milestone
Attributes
    late : important Boolean;
Rules
    late := later_than(expected_completion, sched_compl);
End;

Figure 9. Monitored Milestone

Once we have created an object type such as the milestone type shown in Figure 7,
we can use the inheritance facility of the language to create a new object type,
monitored_milestone, which adds a new attribute as shown in Figure 9. This attribute
explicitly computes whether a milestone is expected to be late. This late attribute could
be used to alert the user of potential unexpected problems when changes are made to the
database. Since this new object type offers a black milestone_dep connector and a white
milestone_need connector, it can be substituted for any existing milestone object without
disturbing the existing functionality of the system. In this case, a subtype of an existing
object type was used. However, in general, objects of any type with the proper
connectors could be substituted. Further, these new objects could be integrated with existing
milestone objects without forcing them to be replaced. This is an example of how the
Cactis data model is well suited to extending the functionality of an existing database
while still supporting existing objects and functionality. This is particularly important
for application domains such as software environments where we expect to add new
objects and tools to the system during its use.
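
To make the rules of Figures 7 and 9 concrete, the following Python sketch mimics them with ordinary classes. It is only an analogy: times are assumed to be plain integers, the intermediate mstone_many objects are replaced by a direct list of dependencies, and values are recomputed on every request rather than maintained incrementally as Cactis does.

```python
class Milestone:
    def __init__(self, sched_compl, local_work, depends_on=()):
        self.sched_compl = sched_compl      # originally scheduled completion time
        self.local_work = local_work        # time to complete this milestone alone
        self.depends_on = list(depends_on)  # milestones this one depends on

    def expected_completion(self):
        latest = 0                          # "Init 0" in the Iterator
        for dep in self.depends_on:
            # plays the role of later_of(latest, dep.exp_compl)
            latest = max(latest, dep.expected_completion())
        return self.local_work + latest


class MonitoredMilestone(Milestone):
    """Subtype adding the derived attribute 'late' of Figure 9."""

    def late(self):
        # plays the role of later_than(expected_completion, sched_compl)
        return self.expected_completion() > self.sched_compl
```

For instance, if a design milestone (10 units of local work) feeds a coding milestone (12 units) which feeds a monitored test milestone (5 units, scheduled for time 30), the test milestone is expected at time 27 and is not late; slipping the design work to 15 units moves it to 32 and makes it late, without any change to the test milestone itself.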

In addition to the features we have illustrated above, the Cactis DDL also offers the
ability to create new attribute types using records, arrays, and a set of primitive types
such as strings, characters, integers, booleans, and real numbers. Attribute evaluation
rules are constructed using expressions built from a subset of the operators of the C
language extended with the iterator shown above, as well as constructs for computing
array and record valued expressions. Name equivalence typing is used throughout.
Finally, the system provides the capability to invoke user supplied functions written in C,
Pascal, or Fortran so that complex or expensive computations can be handled in a
conventional programming language.

6.2.  Implementation 

The Cactis DDL translator is implemented in C using the lex and yacc compiler
generation tools. It creates standard C code and data structures which are suitable for
input to the Unix C compiler.


Most of the implementation of the translator uses straightforward compiler
techniques. The only unusual aspect of the translator is the way it treats expressions. As
discussed in the last section, the optimized update algorithm used requires that expressions
be broken into "chunks" which can be scheduled independently. The translator is
responsible for breaking all expressions into these chunks. This is done using a general strategy
where the values needed to compute an expression are all requested in parallel, then
when all values are available, the expression itself is computed. This involves creating a
chunk for requesting each value, and a chunk for evaluating the expression itself. In the
case of conditional expressions and iterators, the expression evaluation chunk is further
recursively broken into request and evaluation chunks. In all cases, the translator
compiles request and evaluation chunks in such a way that the proper chunk is placed on the
scheduling queue by other chunks at the proper time. This compile time analysis allows
the run-time scheduler to be very simple and efficient.

7.  Performance  Tests 

In this section we discuss a number of performance tests that have been run on
Cactis. The purpose of these tests was not to benchmark Cactis against any known
standard. Rather, we wanted to illustrate the effectiveness of two things: the priority
mechanism of the scheduler and the algorithm used to cluster databases off-line. It is
important to note that the self-adaptive nature of Cactis assumes that there is a certain
amount of similarity of processing requests over time so that information about the past
is in some way predictive of the future. However, even with little similarity in query
history, the scheduler would still be able to take note of the blocks currently in memory and
give scheduling priority to processes that needed these blocks. Also, the clusterer would
still provide some benefits over randomly placed data.

To  perform  the  tests,  we  constructed  test  databases  and  query  streams.  In  order  to 
simulate  repeated  query  sessions  on  one  database,  identical  query  streams  were  used. 
We realize that this is somewhat unrealistic and produces results that are better than they
should be, but it was not possible to get any real feel for how much similarity would
exist from one session to another in an actual Cactis application. In order to facilitate the
construction of test databases, a database generator was constructed. It is described in
the next subsection; the test results are summarized in the subsection after that.

7.1.  The  Database  Generator 

The database generator creates random schema, data, and query streams tailored to
specific  parameters.  In  particular,  the  database  generator  can  be  used  to  create  random 
databases  in  which  aspects  such  as  interconnection  patterns,  overall  interconnection 
level,  and  number  of  relationship  cycles  in  the  data,  can  be  varied  under  user  control. 
This  has  allowed  us  to  test  the  Cactis  system  on  a  range  of  different  types  of  data. 

The  database  generator  accepts  six  pieces  of  information  from  the  user  to  describe 
the  characteristics  of  a  generated  database.  This  information  includes: 

Total_Size -
    The total number of objects in the resulting database.

Connected_Size -
    The size, in terms of number of objects, of each connected component of the
    resulting database.

Density -
    A factor which determines the overall density of attributes within objects.

Cycle_Bias -
    A bias factor which determines the likelihood of relationship cycles occurring in
    the data.

Important_Bias -
    A bias factor which determines the percentage of attributes in the data which will be
    designated important.

Templates -
    A set of attribute dependency templates which determine the actual structure of
    attribute dependencies in the resulting database.


connected := empty;
unconnected := set of objects to be connected;
O1 := random object removed from unconnected;
Add O1 to connected;
Repeat
    O1 := random object in connected;
    If random() > Cycle_Bias Then
        /* no cycle created */
        O2 := random object removed from unconnected;
        Add O2 to connected;
    Else
        /* cycle created */
        O2 := random object in connected;
    Create relationship between O1 and O2;
Until unconnected is empty;

Figure 10. Random Connection Algorithm
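
Figure 10 translates almost line for line into Python. In this sketch the function name connect_component and the rng parameter are hypothetical; Cycle_Bias is assumed to be a probability strictly below 1, since otherwise the loop of Figure 10 would never attach a new object.

```python
import random


def connect_component(objects, cycle_bias, rng=None):
    """Randomly connect the objects of one component (after Figure 10).

    Returns a list of (O1, O2) pairs, one per relationship created.
    """
    if not 0.0 <= cycle_bias < 1.0:
        raise ValueError("cycle_bias must lie in [0, 1) for the loop to terminate")
    rng = rng or random.Random()
    unconnected = list(objects)
    if not unconnected:
        return []
    connected = [unconnected.pop(rng.randrange(len(unconnected)))]
    relationships = []
    while unconnected:
        o1 = rng.choice(connected)
        if rng.random() > cycle_bias:
            # No cycle created: attach a previously unconnected object.
            o2 = unconnected.pop(rng.randrange(len(unconnected)))
            connected.append(o2)
        else:
            # Cycle created: both ends are already connected. As in Figure 10,
            # nothing prevents O2 from being the same object as O1.
            o2 = rng.choice(connected)
        relationships.append((o1, o2))
    return relationships
```

Every pass through the loop creates one relationship, so a component of n objects receives n - 1 tree relationships plus however many cycle-closing relationships the bias produces.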


Repeat Density times
    P := root node of random template tree chosen from Templates;
    O := random object chosen from current connected component;
    Root_attr := Layout(P, O);
    If random() < Important_Bias Then designate Root_attr important;

Layout(P, O):
    Place new attribute A in object O;
    For Each Child: P_child of P Do
        O2 := randomly selected object related to O;
        B := Layout(P_child, O2);
        Make attribute A dependent on attribute B;
    End;
    Create attribute evaluation rule for A which reflects dependencies created;
    Return(A);

Figure 11. Random Attribute Layout Algorithm

The generator works in two phases. In the first phase, it creates data objects and
connects them via relationships. In the second phase it creates attributes to be placed in
these objects, creates attribute dependencies to relate these attributes, and generates
attribute evaluation rules corresponding to these dependencies.

Generated databases are partitioned into connected components. The first phase
works by first creating Total_Size/Connected_Size objects to be placed in each
connected component of the resulting database. The objects in each such component are
connected via relationships using the algorithm shown in Figure 10. This algorithm
establishes random relationships between objects in the connected component. The
proportion of these relationships that represent cycles in the (at this point undirected)
relationship graph is determined by the Cycle_Bias parameter.

After the objects of the connected components are created and related, the database
generator proceeds to place attributes in these objects. This is done using the templates
provided by the user. These templates are trees which indicate how a set of randomly
created attributes should be related to one another. The system uses these template trees
in the recursive algorithm shown in Figure 11 to assign attributes and attribute
dependencies to objects. By varying the nature of the templates used along with the Cycle_Bias
parameter, large random databases can be generated with a wide range of connectivity
characteristics.
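
The recursive layout of Figure 11 can be sketched as below. Several details are assumptions made for illustration: Density is treated as an integer repeat count, an attribute is identified by a hypothetical (owner, serial number) pair, the related map is taken to list the neighbors of each object, and the generation of the evaluation rule text is omitted.

```python
import random


class Template:
    """Node of a template tree; children describe the attributes depended on."""

    def __init__(self, children=()):
        self.children = list(children)


def layout(template, obj, related, attrs, deps, rng):
    """Place one attribute for `template` in `obj`, recursing over its children."""
    a = (obj, len(attrs))  # hypothetical attribute id: (owner, serial number)
    attrs.append(a)
    for child in template.children:
        o2 = rng.choice(related[obj])  # randomly selected object related to O
        b = layout(child, o2, related, attrs, deps, rng)
        deps.append((a, b))            # attribute A depends on attribute B
    return a


def place_attributes(templates, component, related, density, important_bias, rng):
    """Outer loop of Figure 11 for one connected component."""
    attrs, deps, important = [], [], set()
    for _ in range(density):
        p = rng.choice(templates)
        o = rng.choice(component)
        root = layout(p, o, related, attrs, deps, rng)
        if rng.random() < important_bias:
            important.add(root)
    return attrs, deps, important
```

Because each child attribute is placed in an object related to its parent's object, every dependency created this way either stays within one object or crosses exactly one relationship.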

In addition to the connection characteristics of a created database, the user can also
specify  the  characteristics  of  the  query  streams  generated  for  the  database.  In  particular, 
the  user  can  modify  the  percentage  of  reads  and  writes  along  with  the  length  of  query 
streams. 
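
As a rough illustration of these two parameters, the following Python sketch builds a query stream of a given length with a given fraction of writes. The function name `make_query_stream`, the `(kind, object_id)` encoding of a query, and the uniform choice of target object are all assumptions; the source does not describe the actual query format.

```python
import random

def make_query_stream(length, write_fraction, n_objects, rng):
    """Return `length` queries as (kind, object_id) pairs with a fixed share of writes."""
    n_writes = round(length * write_fraction)
    kinds = ["write"] * n_writes + ["read"] * (length - n_writes)
    rng.shuffle(kinds)  # spread the writes randomly through the stream
    return [(kind, rng.randrange(n_objects)) for kind in kinds]

# 100 queries against a 100-object database, 5% of them updates.
stream = make_query_stream(100, 0.05, 100, random.Random(7))
```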

7.2.  Test  Results 

In the performance tests, all databases contained 100 objects. The size of each
connected component varied from 4% to 30% of the database. The density was set to a
default medium value. The cycle bias was set to low, medium, and high in turn. In all
tests, the important bias was at 30%. The template was a simple binary tree.

Cactis  is  intended  for  use  in  engineering  design  applications.  Thus,  we  assume  that, 
compared  to  a  traditional  business  database,  a  Cactis  database  would  have  fewer  and 
larger  objects,  and  would  have  a  significant  amount  of  derived  data.  For  this  reason,  we 
adjusted  the  sizes  of  the  objects  and  the  processing  buffers  so  that  only  a  handful  of 
objects  would  fit  in  memory  at  one  time.  Further,  we  tested  Cactis  with  a  wide  variety  of 
cycle  and  connectivity  settings. 


Figure  12. 


The generated query streams consisted of 5% updates. Each test consisted of
running the same query stream five times, then reclustering. Then the query stream was run
five more times, and reclustering was performed again. Finally, the query stream was run five
more times. This pattern was repeated for all permutations of connectivity and cycle
settings, and for each of the following scheduling algorithms: our optimized priority
algorithm and first come, first serve.
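
The test protocol just described can be summarized by a small harness. This is a sketch only: `run_stream` and `recluster` are hypothetical callables standing in for a Cactis session and the off-line clusterer, and `run_stream` is assumed to return the I/O count of one pass over the query stream.

```python
def run_test(run_stream, recluster, rounds=3, repeats=5):
    """Run the stream `repeats` times per round, reclustering between rounds.

    Returns one list of I/O counts per round.
    """
    io_counts = []
    for round_no in range(rounds):
        io_counts.append([run_stream() for _ in range(repeats)])
        if round_no < rounds - 1:
            recluster()  # no reclustering after the final round
    return io_counts
```

With the default arguments this performs fifteen runs and two reclusterings, matching the sequence described above.
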

Figure 13.

Figures 12 and 13 illustrate the effectiveness of the Cactis scheduling algorithm.
Figure 12 contains three superimposed graphs, one for each of the three cycle bias
settings (marked by a square, a circle, and a diamond). The vertical axis represents the
savings in I/O reads when the priority scheduler is used instead of a first come, first serve
algorithm. The savings is expressed as a percentage of the I/O count derived from using
the first come, first serve algorithm. Figure 13 is a similar graph, except that it does not
compare first come, first serve with the priority algorithm. Rather, it compares two sessions
which both use priority scheduling, with clustering performed in between.
These graphs exhibit fluctuations, but do indicate clear trends.
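
The savings measure plotted on the vertical axes can be written down directly from its definition. In the Python sketch below, the function name `percent_savings` and the sample I/O counts are illustrative only and are not measurements from the tests.

```python
def percent_savings(baseline_io, improved_io):
    """Savings as a percentage of the baseline I/O count."""
    if baseline_io <= 0:
        raise ValueError("baseline I/O count must be positive")
    return 100.0 * (baseline_io - improved_io) / baseline_io

# Hypothetical counts: 200 reads under first come, first serve and 150 under
# the priority scheduler give a savings of 25 percent.
savings = percent_savings(200, 150)
```

A negative value means the second configuration performed more I/O than the baseline.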

These results indicate four general patterns. The first three conclusions refer to the
databases with medium and high cycles. First, for very low connectivity, the Cactis
scheduler provides only a small improvement. Indeed, Cactis would not do well in many
traditional business environments, where there is very little derived data. Second, for
connectivities in the 12% to 20% range, the Cactis scheduler provides a substantial
savings. Third, for very high connectivities (where over 30% of the database is
interconnected), Cactis does not perform well. This makes sense; with a large chunk of the
database interconnected, it is impossible to keep in memory a working set of blocks which
may be used to satisfy a number of requests.

The fourth conclusion is that for all databases with low cycles, the Cactis scheduler
does not perform as well. Again, this makes sense, for the same reason that very low
connectivities lead to bad performance. A low level of cycles means that if a page is currently
in memory, it is less likely to be needed by another computation currently outstanding.
Thus, the priority scheduler is not able to make use of as much locality of reference in
selecting the tasks to perform. However, in the case of low cycles, the clusterer will
provide a substantial improvement in I/O cost (see the discussion of Figure 13 below).
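
The role of locality in this argument can be illustrated with a toy model. The Python sketch below is not the Cactis scheduler: it assumes independent tasks that each touch a single page, a buffer with least-recently-used replacement, and a priority rule that simply runs the oldest outstanding task whose page is already in memory. The name `count_reads` and the task encoding are hypothetical.

```python
from collections import OrderedDict

def count_reads(tasks, capacity, prefer_resident):
    """Count the page reads needed to run every task; each task touches one page."""
    pending = list(tasks)
    buffer = OrderedDict()  # resident pages, least recently used first
    reads = 0
    while pending:
        index = 0  # first come, first serve: always take the oldest task
        if prefer_resident:
            # Priority rule (assumed): oldest task whose page is in memory, if any.
            index = next((i for i, page in enumerate(pending) if page in buffer), 0)
        page = pending.pop(index)
        if page in buffer:
            buffer.move_to_end(page)  # hit: mark as most recently used
        else:
            reads += 1
            buffer[page] = None
            if len(buffer) > capacity:
                buffer.popitem(last=False)  # evict the least recently used page
    return reads

tasks = ["A", "B", "C", "A", "B", "C"]
fcfs_reads = count_reads(tasks, capacity=2, prefer_resident=False)     # 6 reads
priority_reads = count_reads(tasks, capacity=2, prefer_resident=True)  # 3 reads
```

When no outstanding task needs a resident page, the two policies behave identically in this model, which mirrors the observation that low levels of cycles leave the priority scheduler little locality to exploit.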

Figure  13  indicates  the  performance  of  Cactis  when  off-line  clustering  is  used 
between  two  database  sessions.  Cactis  was  run  twice  for  each  data  set,  and  both  times, 
priority  scheduling  was  used.  The  vertical  axis  shows  the  percent  savings  in  I/O  hits  as  a 
result of clustering. The results may be summarized very simply. Reclustering the data
leads to a very substantial savings in cost in all cases, except when connectivity is
extremely low. The reason for this is clear: with little connectivity, there is no effective
way to recluster.

8.  Limitations,  Directions,  and  Conclusions 

The current Cactis system has a number of limitations or unimplemented features.
The system currently has no facilities for rollback or recovery except for those used in
concurrency control. The data model does support an efficient undo capability which
allows transactions to be rolled back and/or reapplied; however, this facility only works
in a single-user environment and not in a multi-user concurrent environment. This facility
uses the property of the data model that all indirect updates automatically performed by
changing one or more attribute values can be just as automatically undone simply by
restoring the old value (a similar property holds for structural changes). Consequently, the
same mechanism used to derive data can also "underive" that data with equal ease. This
capability is very useful in an interactive design environment, where the user may wish
to try out alternatives and explore their effects without permanently committing to them.
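
The property that undo reduces to restoring old values can be seen in a small model. The Python sketch below is a simplification and not the Cactis implementation: it recomputes derived attributes on demand instead of using the incremental update algorithm, and the class `TinyModel` and its methods are hypothetical names.

```python
class TinyModel:
    """Stored attributes plus rules that derive further attributes from them."""

    def __init__(self, base, rules):
        self.base = dict(base)  # attribute name -> stored value
        self.rules = rules      # derived attribute name -> function of the model
        self.log = []           # (name, old value) pairs, newest last

    def get(self, name):
        if name in self.rules:
            return self.rules[name](self)  # derived data is recomputed on demand
        return self.base[name]

    def set(self, name, value):
        self.log.append((name, self.base[name]))  # remember the old value
        self.base[name] = value

    def undo(self):
        name, old = self.log.pop()
        self.base[name] = old  # derived attributes follow automatically

model = TinyModel({"width": 2, "height": 3},
                  {"area": lambda m: m.get("width") * m.get("height")})
model.set("width", 5)  # area is now 15
model.undo()           # area is 6 again
```

Only the stored value is logged; nothing about the derived attribute `area` needs to be saved, because the same rule that derived the new value also derives the old one once the input is restored.
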

A philosophically similar capability with an entirely different implementation
approach can be found in the hypothetical databases approach discussed in [39,40,44].
Systems based on this concept use a differential file mechanism to explore various
versions of a relational database. Current research includes work on extending the Cactis
single-user undo system to operate in a multi-user environment.

In addition to the lack of a rollback and recovery mechanism, the current system
also does not yet support conventional set-oriented queries. Also, because compiled C
functions are used to define attribute evaluation rules (see Section 5), we have little
flexibility in dynamically changing the schema. Finally, the system does not provide
authorization and security facilities, although it is unclear that these are of major
importance in the engineering environments Cactis is intended to support.

As we have stated, the Cactis database is intended to support applications which
require a rich modeling capability, in particular, engineering design applications. In
order to test the effectiveness of the Cactis database in managing such complex
interrelated data, we have begun to explore its use in the support of application areas such as
software environments [19]. In addition, we are in the process of constructing a
distributed version of Cactis, with this effort just getting under way. As modern software
environments will most likely be used in distributed workstation applications, this facility
is viewed as crucial. It will be necessary to allow different users at different machines to
configure their own environments privately and share information. Cactis is well suited
to this task, as it allows the end user to conveniently tailor a local database. Also, the
concurrent implementation of Cactis is naturally suited to a parallel or distributed system.
In this way, various sub-traversals may actually be running at the same time. Additional
work is now underway to support replicated data in a distributed environment. Finally,
research is now underway to support extensions to the logical data model and physical
data model to improve efficiency and expressiveness.

To conclude, we have introduced a powerful data model based on derived data.
This  model  is  accompanied  by  a  very  simple  incremental  update  algorithm  which  lends 
itself  to  an  optimized  self-adaptive  physical  implementation  based  on  the  selective 
scheduling  of  concurrent  subtasks. 